ABSTRACT
The technique of formal abstraction provides an
appropriate tool for specifying an interface between layers
of computer hardware and software. An abstract machine called
AM has been built to address the problem of portability and
reusability of software. This thesis is the design and

I. INTRODUCTION


In today's computer world, portability is a well-known
problem which arises in a variety of situations. Since
computer software evolves in connection with a particular
hardware environment, and often assumes features closely
related to the characteristics of its own hardware, this
problem has been unavoidable.

Formalizing the relationship between hardware and
software resources was treated in a previous NPS thesis by
Yurchak [Ref. 1], whose efforts resulted in the specification
and implementation of an abstract machine, called AM.

The abstraction of a bit mapped display resource was
added to AM in another NPS thesis by Hunter [Ref. 2].

Finally, an abstraction of a formally specified reusable
database was added to the same machine by Zang [Ref. 3].

This presentation is a further extension of the work
started by Yurchak and Hunter: an abstract computer and its
programming environment. Its major objective is a compiler
for a subset of the C language for AM.


A. THE PORTABILITY PROBLEM 

It is well-known that moving large programs from one
machine to another is frustrating work. It is also known
that once the software has been moved to the new machine, it
is not predictable whether or not it will work as before.
Even if it seems to work, it may consume more resources than
expected.

For a couple of reasons, the portability problem is
getting worse, not better:

- Computer architectures have been changed to make them
look like what the programmer wants

- The number of devices included in modern
architectures has been maximized

- Both languages and machines are related to the data they
manipulate in an implementation dependent way

These and other factors make portability a
difficult problem, and in addition, they affect some other
difficult issues like language design and software
engineering.


B. CURRENT IMPLEMENTATIONS TO SOLVE THE PORTABILITY PROBLEM

The usage of high level languages provides a degree of
high level abstraction, and provides some measure of software
standardization and portability. But the portability of high
level languages is limited, since all the layers of software
below this high level have to be moved in order to port such
a system.

There are other abstraction levels between the computer
hardware and the application environments. Operating systems
especially represent a software abstraction of
physical resources, and support the layers of software built
over this level. Starting with CP/M and UNIX, we have seen
some good implementations that provide such an abstract
level to some degree. The main idea of the AM machine is to
abstract and formally define other physical resources found
in typical computing systems.


II. ABSTRACT MACHINE, AM 


The Abstract Machine (AM) is a result of Yurchak's [Ref. 1]
and Hunter's [Ref. 2] efforts to solve the problem of
formalizing the relationship between hardware and software
resources. AM is implemented as a finite state machine
interpreter, with an assembler. Details of the newest version
of the AM assembler can be found in Zang's [Ref. 3] thesis.

"Abstraction" describes the separation of the defining
properties of an object from other, unnecessary details about
it. A programmer is primarily concerned with solving a
problem. Appropriately, the tools at his disposal, such as
programming languages, development aids, and the programming
environment, form a problem solving abstraction. The hardware
(and some of the software) on which this problem solving
abstraction is implemented, however, is an abstraction of a
different sort.

The fuzzy area between software and physical resource
abstractions, sometimes simplistically perceived as the
boundary between hardware and software, exposes a number of
shortcomings in language design and computer architecture
collectively termed the "semantic gap".

Narrowing the semantic gap requires significant changes
in the fundamentals of computer architecture and language
design. Three major factors which significantly contribute to
this problem are:

- Informally described semantics;
- Representation dependent data types;
- Arbitrarily designed instruction set architectures.

The AM was designed to fill this semantic gap by
addressing the above problems [Ref. 1].

In the AM implementation, a text file representing an
assembly language program is translated by the assembler into
a relocatable object module. A loader, part of the AM
interpreter, loads this object module into the appropriate
cells, and AM executes it.

The following presentation is an implementation of a
subset of the high level language "C" for that abstract
machine. It is a compiler which compiles C source code and
generates assembler source code for the AM.




III. DISCUSSION OF "C" SUBSET 


Since commercially good compilers are very large programs
and it takes on the average six man-years to write one of
them, this research work had to be a small subset of the C
language.

The goal was to write a small portion of C in the C
language itself, and then by feeding the output of this work
into itself, to create a native code C compiler.

Since this work was going to be a race against time, the
subset had to be as small as possible, but on the other side,
had to be large enough to be able to compile its own source
code.

The sub-goal was to use a strictly limited number of
features to write the compiler, because any new feature used
in the code would require implementation of the same feature
in the compiler.

The outcome of this work was not sophisticated enough to
compile itself. It evolved as a small subset of the C
programming language, so called "Tiny-C". And since it was
not sufficient to compile its own code, it is used as a
cross-compiler from host MS-DOS computers to the target
machine AM.


A. TINY-C SUBSET

Tiny-C is a small subset of C, and a thesis project
more than a language. There are many features which a real
programming language has to have, but Tiny-C does not.

The Tiny-C compiler was written in five months and is
considered to have the fundamental structure of a real
compiler. Hopefully it will be modified and improved in the
future, and may be usable for real applications.

Appendix A is a listing of the Tiny-C language grammar.
But this grammar is obviously not the complete C
language. At least:

- Structure and union specifiers are not included.

- Functions are not allowed to return addresses.

- Assignments inside expressions are not allowed,
because they were considered as making programs
"unreadable". For instance:

"if ( (joe = jimmy+5) > 95 )" is not allowed in Tiny-C.

- Multiple assignments are not implemented. For instance:
"joe = jimmy = 15 * mary;" is an invalid statement in
Tiny-C.


B. THE TINY-C COMPILER 

Even though the Tiny-C language subset was planned within
the limits of this thesis, the Tiny-C compiler can only
compile and generate code for an even smaller subset of the
above grammar.

The Tiny-C compiler implementation can parse the whole
Tiny-C subset and give proper error messages if necessary.
But,

- due to the time constraints, and

- due to the restricted capabilities of the target AM
machine

the Tiny-C compiler cannot generate code for the whole Tiny-C
language.

In the Tiny-C compiler:

- Floating point arithmetic is not implemented, because it
is not supported by AM.

- Bitwise and shift expressions are not implemented, since
they are not supported by AM.

- Since AM has strictly defined data types and does not
allow type conversions, address, pointer and array types
are not implemented.

- Since AM is designed as an operating system independent
software machine, the "#include" preprocessor is not
implemented.

- Since AM [illegible in source] yet, external
declarations are not implemented.

- Auto, static, register, and boolean types are not
implemented.


IV. DESIGN


This chapter describes the Tiny-C compiler step by step.
But obviously the purpose of this presentation is not to
teach the "compiler writing art", or to explain the target
Abstract Machine's assembler. Complete documentation for the
AM Assembler can be found in Yurchak's thesis [Ref. 1]. For a
better understanding of the following structures, Ullman's
"Compilers, Techniques and Tools" [Ref. 4] is recommended as
a background reference for compiler writing.

The Tiny-C Compiler is written in nine steps. These are:

- Scanner or Lexical Analyzer

- Grammar

- Recursive Descent Parser with Backtracking

- Data Structures for the Parser

- Error Checking and Error Messages

- Emission of Intermediate Code

- Intermediate Code Optimization

- Data Structures for the Code Generator

- Target Code Generation

We will first go through these steps briefly in order to
get acquainted with the architecture of the Tiny-C compiler.


A. SCANNER AND LEXICAL ANALYZER 
In general, scanners and lexical analyzers are language
independent structures. The same scanner may be used for a
couple of different compilers. For this reason we will
introduce this structure even before discussing the Tiny-C
grammar.

Contrary to the header of this section, Tiny-C does not
have a scanner or lexical analyzer in the classical sense.

Even though the most common way of writing compilers is
analyzing the input data stream lexically, and after
tokenizing, passing tokens to the parser as they are needed,
this was not the way scanning was implemented in this
compiler. The Tiny-C scanner is made up of a couple of
routines used by a recursive descent parser with a
backtracking tool. There is no tokenized data stream.

The idea is to read the input stream into a scanner
buffer (which is implemented as a ring buffer) and parse
it there. This technique gives an ability to backtrack and
makes it possible to write a very simple recursive descent
top-down parser. With such a backtracking tool, the grammar
does not need to be massaged into a fully LL(1) grammar; that
is, it can be parsed even if it is ambiguous in the LL(1)
sense. In any ambiguous case, the parser can try all possible
options by backtracking.

Let's start by introducing our scanner buffer
and its initialization.


init_buf()    /* initialize scanner buffer */
{
Reads the input source file into the scanner buffer. Sets the
pointers for the current place (for the initializing procedure,
it is simply the beginning of the scanner buffer) and for the
very last character in the scanner buffer.
}

The scanner may or may not read the whole input stream
at once, because its ring buffer has a limited size. Now the
next question is how to get a character from this buffer
(since tokens are not used, we have to deal with
characters), and if it is the end of the buffer, how to
read some more input into this ring buffer.

char getchr()    /* get character routine */
{
Gets the next character from the scanner buffer, and loads
it into the global "nextch". If it reaches the current end of
the scanner buffer, it reads some more text from the source
into the scanner buffer. If it meets the end of file character,
it sets the "file end" flag TRUE.
}

We can even put a character back into the buffer, if
needed.

ungetchr()    /* un-get character */
{
Puts a given character back into the scanner buffer.
}

After initializing the scanner buffer, we can get as
many characters from there as we want to. But parsers are
higher level concepts, and they shouldn't deal with the low
level structures of scanning like getting three more
characters or putting back one. Parsers mostly work on
tokens. If we had a pure tokenized implementation, we could
simply pop a token number from the scanner buffer. But here
we need something to give tokens to the parser. Also white
characters and comments should be ignored.

String tokens are given to the parser by the following
routine.

matchtoken(str, whtchk)    /* match to a given string token */
char *str,      /* string token */
     whtchk;    /* boolean variable for white chr. check */
{
This routine attempts to read the string token from the
scanner buffer. A following white character or delimiter is
optional, and this decision is made by the caller, namely the
parser. It returns TRUE if the token matches (and a white
character, optionally), else returns FALSE. In case of
FALSE, it backtracks in the scanner buffer to its previous
place.
}
The following routine attempts to match a single character
in the scanner buffer and returns a boolean result.

match(chr)    /* match to a single character */
char chr;     /* character to match */
{
    delwht();
    /* if character matches, return TRUE */
    if (nextch == chr)
    {
        nextch = getchr();
        return (TRUE);
    }
    return (FALSE);
}


Both of these routines delete white characters first.
And in case of FALSE, they do not backtrack to their
previous places exactly, otherwise the following routines
would have to skip white characters one more time. So, in the
FALSE or "unmatched" case, they backtrack to the very
first character which comes just after the white ones.

delwht()    /* delete white characters */
{
Used by both the match-character and match-token routines.
Skips all the following white characters (blank, tab,
carriage return and line feed characters) and the comments in
the scanner buffer.
}
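
The scanning routines described above can be sketched over a plain in-memory buffer. This is only an illustration under stated assumptions, not the thesis code: the Scanner struct and the names skip_white and match_token are hypothetical, the ring buffer refill and the optional white character check are left out, and comments are assumed to have the usual C form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scanner state: the whole source is assumed to fit in memory. */
typedef struct {
    const char *buf;  /* source text, NUL terminated */
    size_t pos;       /* current place in the buffer */
} Scanner;

/* Skip blanks, tabs, CR, LF and comments (the role played by delwht). */
static void skip_white(Scanner *s)
{
    for (;;) {
        char c = s->buf[s->pos];
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
            s->pos++;
        } else if (c == '/' && s->buf[s->pos + 1] == '*') {
            const char *end = strstr(s->buf + s->pos + 2, "*/");
            if (end == NULL) {          /* unterminated comment: go to end */
                s->pos = strlen(s->buf);
                return;
            }
            s->pos = (size_t)(end - s->buf) + 2;
        } else {
            return;
        }
    }
}

/* Try to match a string token. On failure nothing is consumed except the
   white space, so the next attempt does not have to skip it again. */
static int match_token(Scanner *s, const char *tok)
{
    size_t n;
    assert(s != NULL && tok != NULL);
    skip_white(s);
    n = strlen(tok);
    if (strncmp(s->buf + s->pos, tok, n) == 0) {
        s->pos += n;
        return 1;   /* TRUE */
    }
    return 0;       /* FALSE */
}
```

A caller can try several tokens at the same place: a failed match_token leaves the position on the first non-white character, which is the behaviour described for the "unmatched" case.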
B. GRAMMAR

Since there is not a standard C language grammar, we had
to first write a grammar to parse. The Tiny-C subset was
discussed in the previous chapter, and its complete grammar
is presented in Appendix A.

In this grammar (Appendix A), any terminal or non-terminal
followed by a "*" character means "none or more,"
followed by a "+" character means "one or more," and
followed by a "?" character means "optional" or "none or
one." Under these definitions, for example:

program:
(pre-processor)* (data-definition)* (function-definition)+

The non-terminal (program) goes to any number of
(pre-processor), followed by any number of (data-definition),
followed by one or more (function-definition).

The "|" character means "or". For example:

pre-processor:
"#define" (file-definition) |
"#include" (file-definition)

Thus, (pre-processor) goes to "#define" followed
by (file-definition) or "#include" followed by
(file-definition).

The "!" character means "allowed at most once." For
example:

switch-statement:
"switch" "(" (arithmetic-expression) ")" "{"
(case-stmt)* "}"
case-stmt:
"case" | "default"! (constant-expression) ":"
(statement)*

Thus, (switch-statement) can go to "default" at most
once.


C. PARSER

A very simple form of a working parser is presented in
Appendix B. It is a recursive descent parser but with a
backtracking feature. There is a one-to-one correspondence
between non-terminal names in the grammar and function
names in the parser. The reader is encouraged to read the
parser with an eye on the grammar. With the grammar's help,
it is not difficult to understand the structure of the
parser.

In this first version of the Tiny-C parser, all
functions backtrack if they fail. In the real Tiny-C
environment this is largely unnecessary, because in
the Tiny-C grammar, ambiguity exists in a few places only.
The reason this first version is presented in Appendix B is
its clarity and simplicity. In the following versions,
unnecessary backtracks have been taken out.
In all the routines in the parser, there are two
backtracking tools. First, the "oldp" old pointer points to
the parser's previous place in the scanner buffer, and
second, the "line_no" line number keeps track of the current
line number for error checking purposes. If a function fails,
these routines backtrack to their previous states and try to
find another legal path to parse.
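
A minimal sketch of this save-and-restore idea follows. It assumes a toy grammar, item: "a" "b" | "a" "c", whose two alternatives share a first token and so are not LL(1). The helper names lit, two and item are hypothetical; only the saved buffer position and the saved line number come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char *src;  /* scanner buffer (assumed to hold the whole input) */
static size_t cur;       /* current place in the scanner buffer */
static int line_no = 1;  /* current line number */

/* Match one literal after skipping blanks and newlines. */
static int lit(const char *tok)
{
    size_t n = strlen(tok);
    assert(src != NULL);
    while (src[cur] == ' ' || src[cur] == '\n') {
        if (src[cur] == '\n')
            line_no++;
        cur++;
    }
    if (strncmp(src + cur, tok, n) != 0)
        return 0;
    cur += n;
    return 1;
}

/* Parse "first second"; on failure restore the saved state. */
static int two(const char *first, const char *second)
{
    size_t oldp = cur;        /* previous place in the buffer */
    int old_line = line_no;   /* previous line number */
    if (lit(first) && lit(second))
        return 1;
    cur = oldp;               /* backtrack */
    line_no = old_line;
    return 0;
}

/* item: "a" "b" | "a" "c"   (both alternatives start with "a") */
static int item(void)
{
    return two("a", "b") || two("a", "c");
}
```

When the first alternative fails after consuming "a", the restore step puts both the position and the line counter back, so the second alternative starts from a clean state.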

Appendix C has the routines for the basic non-terminals
and terminals. Together with Appendix B, it presents a
working version of that parser.


D. DATA STRUCTURES FOR THE PARSER

Now is the time to introduce some data structures to
improve the Tiny-C parser. The first one is going to be a
name string structure since all the following tables need
this structure.

1. Name String Implementation

A name string is basically a big character array (or
a string) which holds all the names used in the source file.
Tiny-C has two routines to implement this structure:
The first one is used to add a new name into the name
string, and the second one is used to look for a given name.


add_name()    /* add a name into the name string */
{
Adds a new name into the name string from the
"id_name" global variable. The "id_name" variable holds the
current identifier name all the time. The function
"identifier" in the parser sets this variable whenever it
parses an identifier.
}

find_name()    /* find a name in the name string */
{
Looks for "id_name" in the name string. If found, it
loads the identifier's address into a pointer and returns
TRUE, else returns FALSE.
}

In the current version of Tiny-C, the name string was
implemented completely sequentially. Instead, there could
have been a hashing mechanism, which would be much more
efficient. When testing the whole compiler, it was observed
that a large number of predefined constant names and
variables was making execution slow.

2. Constant Table

Constants are implicitly declared elements. In
the Tiny-C compiler, a constant table is implemented to take
care of them. Since every occurrence of a constant denotes
the same declaration, we do not need to check if a constant
occurs more than once. We simply add each constant into the
constant table as it occurs.


add_num()    /* add an integer number into constant table */
{
Adds an integer numeric value into the constant table
if it is not in there, and returns its address in a
pointer.
}

In the current version of AM, integers are the only
numeric type. So it is the only numeric type implemented in
Tiny-C, and is the only constant denotation required.

Since input data is an integer for the above routine,
and since the source file is read as a character stream from
the scanner buffer, we need a string-to-numeric conversion
routine to convert text input into numeric values.
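
Such a string-to-numeric conversion can be sketched as follows. The global names num_name and num_cnst and the success flag are assumptions for the illustration; only unsigned decimal digits are handled and overflow is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

static char num_name[32];  /* digits of the constant, set by the parser */
static int num_cnst;       /* resulting numeric constant */

/* Convert the decimal digits in num_name into num_cnst.
   Returns 1 on success, 0 if the text is empty or holds a non-digit. */
static int str_num(void)
{
    int value = 0;
    const char *p = num_name;
    if (*p == '\0')
        return 0;
    for (; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return 0;
        value = value * 10 + (*p - '0');  /* shift left one decimal place */
    }
    num_cnst = value;
    return 1;
}
```

The result is left untouched when the conversion fails, so a caller can report the error without losing the previous constant.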




str_num()    /* string to numeric */
{
Takes a string "num_name" (numeric name) which is
set by the "constant()" routine in the parser, calculates
its numeric value, and returns it in the "num_cnst"
(numeric constant) global as an integer.
}


3. Definition Table

In Tiny-C, the preprocessor command "#define" lets
us define constant identifiers. So, a definition table is
implemented for these identifiers.

In case of a "#define" declaration, we need to add
a new constant identifier into the definition table.

add_cnid()    /* add constant identifier */
{
First, checks if the given id-name is already in the
definition table. If so, it gives an error, since defining
the same constant-id more than once is nonsense. Otherwise
it adds the given constant identifier into the definition
table.
}

The next problem in implementing constant identifiers
is finding the corresponding values for these constant
names, if they are met when parsing a program.

find_cnid()    /* find constant identifier */
{
Takes a constant identifier name and looks for it
in the definition table. If found, it sets a pointer to
its place in the definition table and returns TRUE, else it
returns FALSE.
}


4. Scoping Rule

In classical compilers, symbol tables are primarily
responsible for establishing the scoping rules. The Tiny-C
compiler solves the scoping problem in a different way.

Our Tiny-C compiler has a variable string which holds
all valid variable names in the current scope. When the
parser starts parsing a new function or a new compound
statement (namely a new "block" in block structured language
literature), the parser puts a mark into the variable string
to define the beginning of the new block, and adds the
following variable declarations into the same string.
Whenever the parser goes out of a block, it deletes the
very last block's variables from this string. (Since the
Tiny-C compiler is a one-pass compiler, the deletion of the
variables for the last block is acceptable in this case.) So,
any time a variable is used, the compiler looks for this
variable in the variable string, starting from the end to
the beginning. If found, it finds a pointer to the symbol
table for this variable; if not, it gives an error message
since that particular variable is unknown (or out of scope).
find_var()    /* find a variable in variable string */
{
Takes an id-name and looks for it in the variable
string. If found, it sets a pointer to the symbol table
pointing to its place in there and returns TRUE, otherwise
it returns FALSE.
}

We introduced searching for variable names in the
variable string before discussing inserting them. The reason
is, whenever the parser meets a new variable declaration,
it is supposed to add that new variable into both the symbol
table and the variable string. In the implementation, one
single routine does both these duties. Since the symbol
table is not introduced yet, we didn't meet this routine
either.

Here, the theory to satisfy the scoping rule is: mark the
beginning of a block in the variable string when starting
to parse a new block, and delete the most recent block's
variables when exiting from it. So any variable which is not
in the variable string is automatically out of scope.
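
The mark-and-delete rule can be illustrated with a small stack of names. This is a sketch under assumptions, not the thesis data structure: the variable string is modelled as an array of name pointers, a NULL entry stands for the block mark, and the helper names enter_block, exit_block and declare_var are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_VARS 256

static const char *var_str[MAX_VARS]; /* names valid in the current scope */
static int sym_index[MAX_VARS];       /* symbol table entry of each name */
static int var_top;                   /* number of slots in use */

/* Put a mark (NULL) at the beginning of a new block. */
static void enter_block(void)
{
    assert(var_top < MAX_VARS);
    var_str[var_top++] = NULL;
}

/* Delete the variables of the most recent block, including its mark. */
static void exit_block(void)
{
    while (var_top > 0 && var_str[--var_top] != NULL)
        ;
}

static void declare_var(const char *name, int sym)
{
    assert(name != NULL && var_top < MAX_VARS);
    var_str[var_top] = name;
    sym_index[var_top] = sym;
    var_top++;
}

/* Search from the end to the beginning, so inner declarations win.
   Returns the symbol table index, or -1 if unknown or out of scope. */
static int find_var(const char *name)
{
    int i;
    for (i = var_top - 1; i >= 0; i--)
        if (var_str[i] != NULL && strcmp(var_str[i], name) == 0)
            return sym_index[i];
    return -1;
}
```

Because the search runs backwards, a name redeclared in an inner block hides the outer one until exit_block removes it.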

5. Symbol Table

In the Tiny-C implementation, the symbol table is
responsible for variables, function names, label names, and
function arguments.

Let's first see how to add a new variable into
the symbol table when a variable declaration occurs.

add_var()    /* add variable */
{
Gets a new variable's id name and gets its type, then adds
it into the symbol table and variable string.
}

Similarly, label declarations require label names to
be added into the symbol table, too. But we shouldn't
add labels into the variable string, since in "C" they do
not satisfy the same scoping rules as variables.

add_label()    /* add a label into symbol table */
{
Gets a label, and adds it to the end of the symbol table.
}


Whenever the parser meets a new label declaration, it
adds this label into the symbol table by the above routine.
But it must be smart enough not to accept duplicate label
declarations.

One pointer is assigned to point to the beginning of the
very last function in the symbol table. So, when the
parser meets a new label declaration, it first starts from
the beginning of the last function in the symbol table, and
goes all the way down to the end of it, to look for the
same label name. If it finds one, it gives a duplicated
label declaration error, since the same label is not allowed
to be declared twice in the same routine in this language.
The following routine does this job in the Tiny-C compiler.

dup_lbl()    /* is duplicate label? */
{
Checks if the same label name has been declared before.
}
6. Label Table

In the C language, any label referenced by a goto
statement has to be declared somewhere in the same function.
Classically, compilers read the source file twice. But the
number of input/output operations is very important in
total execution speed. Since the Tiny-C compiler is
designed as a "one-pass compiler", we immediately have
this problem: detection of undeclared labels.

Classical two pass compilers read all label declarations
in the first pass. So, in the second pass they can check if
"goto label" statements are valid. When our one-pass Tiny-C
compiler meets a goto statement, and if the referenced
label name has not been declared yet, it is unpredictable
if this label is going to be declared in the following
statements. To solve this problem, Tiny-C implements a
label table, and at the end of every function, it
checks if a referenced but undeclared label exists.

Whenever a label is referenced by a goto statement,
the compiler saves it in the label table by the following
routine.


save_lbl()    /* save label into the label table */
{
Inserts a label which is referenced by a "goto"
statement into the label table for future checking.
}

And at the end of every function, the compiler
checks if the labels referenced by gotos were ever declared
in the function.


check_labels()    /* check labels */
{
Called by the parser at the end of every function body.
Checks if labels in the label table are declared in the
symbol table.
}
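
The deferred label check can be sketched as below. The two small arrays stand in for the label entries of the symbol table and for the label table; their layout, the fixed capacity, and the choice to return a count of missing labels are assumptions made for the illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_LBL 64

static const char *declared[MAX_LBL];   /* stand-in for labels in the symbol table */
static int n_declared;
static const char *referenced[MAX_LBL]; /* label table: targets of goto statements */
static int n_referenced;

/* A label declaration was parsed. */
static void add_label(const char *name)
{
    assert(n_declared < MAX_LBL);
    declared[n_declared++] = name;
}

/* A goto statement was parsed: remember its target for later. */
static void save_lbl(const char *name)
{
    assert(n_referenced < MAX_LBL);
    referenced[n_referenced++] = name;
}

/* At the end of a function body: count the referenced labels that were
   never declared, then clear both lists for the next function. */
static int check_labels(void)
{
    int i, j, missing = 0;
    for (i = 0; i < n_referenced; i++) {
        int found = 0;
        for (j = 0; j < n_declared; j++)
            if (strcmp(referenced[i], declared[j]) == 0) {
                found = 1;
                break;
            }
        if (!found)
            missing++;
    }
    n_referenced = 0;
    n_declared = 0;
    return missing;
}
```

A forward goto is accepted as long as its label shows up before the function ends, which is what lets a single pass over the source suffice.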

7. Function Calls

Tiny-C keeps function names and their argument counts in
the symbol table. In case of a function call, it checks
if this function has been called before, and if it has not,
enters its name and argument count into the symbol table.
If it has been entered before, it checks if the
argument count in the new function call is the same as
the one in the symbol table. If the argument counts are
not the same, it gives an "inconsistent argument count"
error.
add_fun(fun_no)    /* add function into symbol table */
char *fun_no;
{
Adds a function name and its argument count into the
symbol table in case of a function call, and if it is the
first call of the function. If it is not the first call, the
function is already in the symbol table. So, it checks if
argument counts match. In both cases, it returns the
function's function number (basically symbol table entry
number) to the parser, to emit intermediate code.
}
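
The argument count check can be illustrated as follows. The table layout and the two-argument signature are assumptions; unlike the routine described above, this sketch signals a mismatch by returning -1 instead of printing an error and still returning the function number.

```c
#include <assert.h>
#include <string.h>

#define MAX_FUN 64

/* Hypothetical stand-in for the function entries of the symbol table. */
static struct {
    const char *name;
    int argc;
} fun_tab[MAX_FUN];
static int n_fun;

/* Returns the function number (table entry number), or -1 when the
   argument count differs from the one recorded at the first call. */
static int add_fun(const char *name, int argc)
{
    int i;
    for (i = 0; i < n_fun; i++)
        if (strcmp(fun_tab[i].name, name) == 0)
            return fun_tab[i].argc == argc ? i : -1;
    assert(n_fun < MAX_FUN);
    fun_tab[n_fun].name = name;   /* first call: record name and count */
    fun_tab[n_fun].argc = argc;
    return n_fun++;
}
```

The first call fixes the expected count, so every later call is checked against it even though no declaration has been seen.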
8. Function Declarations

In the C language, parameter declarations follow a
function declaration. Parameter names have to be given
inside parentheses immediately following a function name,
and then they have to be declared one more time with their
types.

The following parameter declarations have to match the
ones given with the function name. Tiny-C has two routines to
get this mechanism to work properly.

chk_parm()    /* check parameter */
{
At the end of a parameter declaration, this routine
checks if that parameter was given as one of the function's
arguments, or if it is declared more than once. If
everything is proper, it enters the parameter's type into the
symbol table, since the parameter name was already entered
before (when parsing the parameter list following the
function name).
}


And, at the end of all parameter declarations, the compiler
has to make sure that all the arguments given with the
function name were declared as parameters.

chk_parms()    /* check all the parameters */
{
When parameter declarations are done, checks if there is
any parameter name in the symbol table without its type.
Since parameter names are entered into the symbol table when
parsing the parameter list, and types are entered in there
when parsing the following parameter declarations, if there
is any parameter with its type missing, that means it is not
declared.
}

These are all the data structures used by the parser to
manage variables, constants, labels, function names and
arguments, and all remaining structures in the Tiny-C parser.
The following section improves the parser one more step, and
handles the error checking mechanism.


E. ERROR CHECKING

A list of error and warning messages used in the Tiny-C
compiler is given in Appendix D.

Error and warning messages are given by the
following routines:


err_msg(msg_no)    /* error messages */
char msg_no;
{
    /* increment error counter */
    ++err_cnt;
    /* give line number of the error */
    printf("%d error! ", line_no);
    /* and give the error message */
    switch (msg_no)
    {
    /* case list for all error messages described in Appendix D */
    }
}

warning(msg_no)    /* warning messages */
char msg_no;
{
    /* give line number of the warning */
    printf("%d warning! ", line_no);
    /* give the message */
    switch (msg_no)
    {
    /* case list for all warning messages described in Appendix D */
    }
}


F. INTERMEDIATE CODE GENERATION

In order to generate code for the target machine, first
the compiler has to build a parse tree. Appendix E is a
list of nodes that form Tiny-C parse trees.

Now, the same old heavy-duty parser can shoulder one more
job: emissions of intermediate code.

The following routine does the intermediate code
emissions, when called by the parser. It takes two arguments:
the node itself, and the number of the children of this node.
If there is not any error up to that time, the parser emits
the code into an emission table (which is in fact a
flattened parse tree) and increments the emit-counter.
emit(node, child)    /* emit intermediate code */
char node,     /* node kind to emit */
     child;    /* # of the children belonging to this node */
{
    /* if there is not any error, give emissions */
    if (!err_cnt)
    {
        emitstr[emit_cnt] = node;
        emitchl[emit_cnt] = child;
        ++emit_cnt;
    }
}
G. POSTPONED EMISSIONS

There are times when we do not want to emit code in the
same order as we parse. An assignment statement is a good
example for this situation.
Suppose we have the assignment:

joe = jimmy * 5;

The parse tree for this statement is:

              assignment

    variable            multiplication
      joe          variable        constant
                    jimmy             5
Since our parse tree is in flattened form, the order of
the intermediate code emissions for the above tree should
be:

jimmy, variable, 5, constant, multiplication, joe, variable,
assignment

But this is not the same order we parse! There may be
some quick solutions for this particular problem. But the
case might be worse than the above one. Consider the
following statement:

joe[ (jimmy*15) ^ mary++ ] = joe[1];

Here, the left value is not a simple variable. It is an
array element with a complex index expression.

Summarizing, there are cases when we simply do not want
to give emissions immediately. We want to save them, and then
at the end of some certain expressions we want to emit them.
This type of emission is called "postponed emission."

Up to now, our recursive descent parser has been
suffering from the same problem. But for the sake of
simplicity, we ignored it. Now is the time to build some
mechanisms to make the parser able to postpone emissions.

First of all, we have to make our emission tool more
flexible. The following is a revised version of our
"emit-code" function.


amit (nade, child) 


idet nade, 
GL: 
4 
/* if there are nat any errars, give emissicans */ 


ii ('epm.<ont ии 


1 
* (emitptri4] + (*( emitptrle] ))) = node; 
* em: pr Sr eh ewemitptrrled ))?) = child; 
rr emvupor lı. 1/7 
X 
> 
Аз сап Бе seer, this revised version is not restricted tc 


emit code into emission table all the time. It can emit code 
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into añ babie iii is addressed by “emitptr" painters. That 
15; by setting these pointers somewhere else, we саг 
"redirect" the emissions. 

The following  ranutine directs emissions into a given 
painter set. This given pointer set is  suppased to be 


painting ta a table, af the same type as the emission table. 


drct_emit(emit_ptr, ptri, ptre2, ptr) /* direct emits */ 


int xemit ptr, /* painter set ta emissions 5 
Eu) : 
pipe LE) 3 
X DOCTI 4 : /* painters to new directian =y 
d 


emit_ptridi=ptri; 

emit_ptriil=ptra; 

emit etrisizmətrs: 

As we have seen before, our emit-code routine emits
into a table pointed to by the global "emitptr" emission
pointers. But if we redirect these pointers somewhere
else, do we not lose the address of the previous table? So we
have to be able to save our previous emission addresses
somewhere. The following routine saves these pointer
addresses in the given ones.

rplc_emits(ptr_a, ptr_b)        /* saving emit pointers */
int *ptr_a[],
    *ptr_b[];           /* pointer sets to both emit-tables */
{
    ptr_a[0] = ptr_b[0];
    ptr_a[1] = ptr_b[1];
    ptr_a[2] = ptr_b[2];
}

And the very last problem: we are able to redirect our
"emitptr" emission pointers into some tables (successive
emissions are then entered into these tables), and we
are able to save the previous values of these pointers. But
what about the "postponed emissions," namely the ones we
saved somewhere other than our emission table? The
following routine transfers previously saved emissions from
one table into another.

trns_emits(emit_a, emit_b)      /* transfer emits */
int *emit_a[],          /* destination table pointers */
    *emit_b[];          /* source table pointers */
{
    char i;

    for (i=0; i < *(emit_b[2]); ++i)
    {
        *(emit_a[0] + (*(emit_a[2]))) = *(emit_b[0]+i);
        *(emit_a[1] + (*(emit_a[2]))) = *(emit_b[1]+i);
        ++(*(emit_a[2]));
    }
}
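The three mechanisms (redirect, save, transfer) can be combined as sketched below for the statement "joe = jimmy * 5". This is an illustration under assumptions, not the thesis code: the table size, the node kind values, the child counts passed to `emit`, and the `assign_demo` routine are all hypothetical, and `drct_emit` is reused to save the old pointer set.

```c
#define TABLE_MAX 32

enum { VARB = 100, CNST, MULT, ASGN };   /* hypothetical node kinds */

/* Main emission table and a side table for postponed emissions. */
static int main_str[TABLE_MAX], main_chl[TABLE_MAX], main_cnt;
static int side_str[TABLE_MAX], side_chl[TABLE_MAX], side_cnt;

/* [0] node table, [1] child table, [2] emit counter */
static int *emitptr[3] = { main_str, main_chl, &main_cnt };

static void emit(int node, int child)
{
    int *cnt = emitptr[2];
    if (*cnt < TABLE_MAX) {
        emitptr[0][*cnt] = node;
        emitptr[1][*cnt] = child;
        ++*cnt;
    }
}

static void drct_emit(int *set[], int *p1, int *p2, int *p3)
{
    set[0] = p1;
    set[1] = p2;
    set[2] = p3;
}

static void trns_emits(int *dst[], int *src[])
{
    int i;
    for (i = 0; i < *src[2] && *dst[2] < TABLE_MAX; ++i) {
        dst[0][*dst[2]] = src[0][i];
        dst[1][*dst[2]] = src[1][i];
        ++*dst[2];
    }
}

/* Emits in parse order (target first) but produces the flattened order. */
static void assign_demo(void)
{
    int *saved[3];
    int *side[3] = { side_str, side_chl, &side_cnt };

    /* save the current pointer set, then postpone the left value */
    drct_emit(saved, emitptr[0], emitptr[1], emitptr[2]);
    side_cnt = 0;
    drct_emit(emitptr, side[0], side[1], side[2]);
    emit(0, 0);  emit(VARB, 1);          /* joe (symbol 0), parsed first */

    /* restore the pointers and emit the right-hand side directly */
    drct_emit(emitptr, saved[0], saved[1], saved[2]);
    emit(1, 0);  emit(VARB, 1);          /* jimmy (symbol 1) */
    emit(5, 0);  emit(CNST, 1);
    emit(MULT, 2);

    /* append the postponed left value, then the assignment node */
    trns_emits(emitptr, side);
    emit(ASGN, 2);
}
```

After `assign_demo()` the main table holds jimmy, variable, 5, constant, multiplication, joe, variable, assignment, even though the target was parsed first.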


H. CODE OPTIMIZATION

Under normal conditions, code optimization can be done
on both intermediate code and target code. When generating
target code, compilers attempt to find the best code
generation sequence, eliminate common sub-expressions, and
minimize the number of temporary variables. And after
code generation is done, they pass through it again one or
two times for peephole optimization, jump optimization,
etc.

Our Tiny-C intermediate code has a flattened tree
structure; it is possible to traverse it as a tree. In order
to do this, we would need some interface routines between
this flattened form and a real tree structure. Then we could
logically look at it as a tree and travel from root to leaves
or vice versa.

In this thesis work, it was decided to generate code as
quickly and simply as possible. So the Tiny-C compiler uses
sequential code generation, even though it is not the best
way to do it.

Since our code is going to be source code for the AM
assembler, it would not be easy to work on a "text"
file to optimize it. At this point, however, we can work on our
intermediate code to make it more effective. So, contrary to
the classical compilers, our code optimization is performed
only on intermediate code, instead of on both intermediate and
target code.

There are several things we do in the code optimization
phase:

- Removing dead code
- Label/jump optimization
- Emitting imbedded assignments

The last one cannot be classified as part of the code
optimization phase, although we deliberately left it to this
point. We will see why shortly.

1. Dead Code Elimination

In some cases, the Tiny-C compiler generates dead
code. For instance:

In the intermediate code list, there is a node
called "DUMMY". Sometimes our parser may emit some code, but
then it may realize that this code is not necessary. In that
case, emitting a "DUMMY" node makes the previous code "out
of concern," or a "dummy statement."

In fact, such a tool is not truly necessary, but it was used
in early versions of the compiler. In later phases
this "DUMMY" node was used only in the "case" statement. Due
to constraints on time, it has not been removed.

As we discussed before, this thesis is a presentation of
the first version of the Tiny-C compiler, and hopefully a
reference for its future authors, rather than a discussion
of compiler writing techniques.

Nevertheless, to simplify the tree we can remove this
"DUMMY" node and its children.

In addition, there may be dead code that is generated by
the compiler. An example:

    joe = 3;
    goto there;
    joe = jimmy*5;
    ++jimmy;
    there:

Here the two statements in the third and fourth lines are
dead code. They will never be executed. So we can remove this
dead code from the parse tree.
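A minimal sketch of this kind of removal, assuming a simplified intermediate form with one entry per statement. The kind names and `kill_dead_code` are hypothetical; the real Tiny-C pass works on the flattened parse tree.

```c
enum { STMT, GOTO, LABEL, NOOP };   /* hypothetical statement kinds */

/* Replaces every statement between a "goto" and the next label with
   NOOP. Returns the number of statements replaced. */
static int kill_dead_code(int code[], int n)
{
    int i, dead = 0, removed = 0;

    for (i = 0; i < n; ++i) {
        if (code[i] == LABEL) {
            dead = 0;               /* a label may be a jump target */
        } else if (dead) {
            if (code[i] != NOOP) {
                code[i] = NOOP;
                ++removed;
            }
        } else if (code[i] == GOTO) {
            dead = 1;               /* nothing after this is reachable */
        }
    }
    return removed;
}
```

Applied to the five-line example above, the two statements between "goto there;" and "there:" become NOOP entries.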


2. Dead Label Elimination

In general, any label declaration is automatically the
beginning of a new basic block. However, if there is no
"goto" for this label, then such a label is part of a larger
basic block.

Having basic blocks as large as possible reduces the
amount of data transfer between registers and memory. In
other words, it reduces the number of "register cleaning"
operations.

So, if we detect labels that are declared but never
used, removing them is an improvement.
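The check can be sketched as follows, assuming each intermediate entry carries a kind and a label number. The `struct node` layout and `kill_dead_labels` are hypothetical names used only for this illustration.

```c
enum { STMT, GOTO, LABEL, NOOP };   /* hypothetical node kinds */

struct node {
    int kind;
    int label;      /* label number, used by GOTO and LABEL nodes */
};

/* Replaces every label that no "goto" refers to with NOOP.
   Returns the number of labels removed. */
static int kill_dead_labels(struct node code[], int n)
{
    int i, j, used, removed = 0;

    for (i = 0; i < n; ++i) {
        if (code[i].kind != LABEL)
            continue;
        used = 0;
        for (j = 0; j < n && !used; ++j)
            if (code[j].kind == GOTO && code[j].label == code[i].label)
                used = 1;
        if (!used) {
            code[i].kind = NOOP;
            ++removed;
        }
    }
    return removed;
}
```

The quadratic scan is chosen for brevity; a table of reference counts per label would do the same work in one pass.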

3. Temporary Variables in the Front End

There is one more thing that has to be done when
passing over the intermediate code for optimizing purposes.

In the parser, arithmetic expressions following a
"switch" reserved word are assigned to temporary
variables. These temporary variables are represented by
"TVAR" nodes, each with a temporary variable number. Since the
results of those arithmetic expressions are assigned to
"TVAR" nodes, and these variables are compared with "case"
labels, we have to allocate memory for these nodes just as
we are going to do for normal variables. The values of
"TVAR" nodes may or may not reside in their allocated memory
locations; they may be kept in registers, too. The register
manager in the following section will treat them just
like variable nodes.

In fact, all variables are referred to by their
symbol numbers, that is, their symbol table entry numbers. And at
this point, we know our symbol table length. So we can assign
new symbol numbers to these "TVAR" nodes, and change
their names to "VARB" variable nodes. Then the register
manager can take care of the rest.
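The renumbering can be sketched as below. It assumes that temporary numbers start at zero and that each entry is a (kind, number) pair; both the `struct node` layout and `rename_tvars` are hypothetical.

```c
enum { VARB = 1, TVAR = 2, CNST = 3 };   /* hypothetical node kinds */

struct node {
    int kind;
    int num;        /* symbol number or temporary variable number */
};

/* Turns every TVAR node into a VARB node whose symbol number follows
   the existing symbols. Returns the new symbol table length. */
static int rename_tvars(struct node code[], int n, int sym_cnt)
{
    int i, highest = -1;

    for (i = 0; i < n; ++i) {
        if (code[i].kind == TVAR) {
            if (code[i].num > highest)
                highest = code[i].num;
            code[i].kind = VARB;
            code[i].num += sym_cnt;     /* temporary k becomes symbol sym_cnt+k */
        }
    }
    return sym_cnt + highest + 1;
}
```

With three symbols already in the table, temporaries 0 and 1 become symbols 3 and 4, and every use of the same temporary maps to the same new symbol.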

4. Code Optimization, Phase 1

The following routine is the first part of the
intermediate code optimization, and is called just after the
parser.

frstopt()       /* first pass of optimization */
{
    - Detects dead code and replaces it with "NOOP" (no
      operation) nodes.

    - Detects unused labels and replaces them with "NOOP"
      nodes.

    - Replaces "TVAR" nodes with "VARB" nodes and assigns
      them new symbol numbers starting from the last symbol number
      in the symbol table.
}

5. Separation of Front End and Code Generator

Up to now, our intermediate code has been in memory,
in its allocated location (the emission table). Because the
emission table has a fixed size, it has to be large enough to
hold the largest program. If the input
source file is too big to fit into our emission table, Tiny-C
responds with an error message. (This is one of the reasons
it is called Tiny-C.)

It is possible to pass this emission table to the second,
target machine dependent part of the compiler, but it would
not be efficient.

There is a logical separation between the parser and
intermediate code generator on one side and the target code
generator on the other.
The first part is totally language dependent and machine
independent, and the second part is machine dependent but
language independent. So putting a physical separation
between these logically independent units is always a good
idea, and has been implemented in most compilers.

For this reason we should end the first part of
this compiler here. But before doing this, we have to pass
the outcome of this part to the second part of the compiler
(basically, the code generator of Tiny-C).

The code generator is going to need the intermediate code,
a symbol table, a constant table, and the number of
temporary variables used by the parser. All this information
has to be written in some place for later access by the code
generator.

But we have a last minute problem here, which we
deliberately ignored up to now. This is "imbedded
assignments."

6. Imbedded Assignments

In the C language the statement:

    joe = jimmy++ * 3;

is in fact two different statements:

    joe = jimmy * 3;

and a following:

    ++jimmy;

The second statement here is an "imbedded assignment." We
did not emit code for imbedded assignments up to now, and in
fact we have ignored this problem on purpose, because right
now, when writing intermediate code into a quad file, we
can simply emit these codes without any effort.

7. Code Optimization, Phase 2: The Quad File Filter

The following routine is the second part of the
intermediate code optimizer. It is called just after the
first-pass optimizer.

scndopt()       /* second-pass optimization */
{
    Creates a quad file named "TC.QQQ" and:

    - Writes intermediate code in this file, without
      "NOOP" codes and with additional imbedded assignments.
    - Marks end of intermediate code
    - Writes symbol table
    - Writes number of the temporary variables (TVARs)
    - Writes constant table
    - Writes name string
    - And closes that quad file.
}


I. DATA STRUCTURES FOR CODE GENERATION

The final step is code generation for the Abstract
Machine.

As discussed before, the output of this compiler is not
going to be binary code which is ready to be linked and
run. It is going to be a source file for the AM assembler,
so it will be readable.

Since this is the second part of the compiler, it
receives the work done in the first part. The following
routine reads a Tiny-C quad file from the disk.

read_quad()     /* read quad file */
{
    Reads the quad file from disk in the sequence of intermediate
    code, symbol table, constant table and name string.
}

Now the compiler has all the information it needs to go
ahead and generate code. But right now it does not have any
tools to do this. We build some tools first to help the
code generation phase.

The target machine AM theoretically has an unlimited
number of registers. This is not realistic. So the Tiny-C
compiler assumes that AM has a reasonable number of
registers, and tries to manage them properly.

Keeping all the variables and all the intermediate
results in registers would be ideal. But since this is
impossible, and we are going to run out of registers after
generating a piece of code, we will need a "register
manager" to handle the limited number of registers
properly. The Tiny-C compiler does not have a single "register
manager" routine. Instead, we will introduce a couple of
routines which together manage the AM registers.

1. Address Descriptors

As is known, a compiler cannot keep all the
variables in registers all the time. So it is obvious
that a variable may be in a register, or in its
allocated memory location, or both, at a particular time.
A compiler needs a mechanism to keep track of the current
addresses of all variables. The following routine sets symbol
addresses by given parameters.

adr_dscr(sym_no, status, reg_no)    /* symbol addr. descriptor */
char sym_no,            /* symbol number */
     status,            /* address status */
     reg_no;            /* register number */
{
    Sets current addresses of variables. All variables
    have an 8-bit value address descriptor. Status may be
    "in-register", "in-memory" or "in-both". If the 7th bit of
    this descriptor is 1, that means the variable is in its
    allocated memory location. If the value stored in bits 0 to
    6 is zero, that means the variable is not in any register. If it
    is different from zero, that value minus one gives the number
    of the register in which the symbol is stored.
}
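One way to read this encoding is sketched below. The bit positions follow the description of the variable descriptor as far as it can be read; the type name, the macro names and the helper functions are assumptions made for illustration.

```c
typedef unsigned char descr_t;      /* 8-bit address descriptor */

#define IN_MEMORY 0x80u     /* bit 7: value is in its memory location */
#define REG_FIELD 0x7Fu     /* bits 0..6: register number plus one, 0 = none */
#define NO_REG    (-1)

/* Builds a descriptor; pass NO_REG when the value is in no register. */
static descr_t make_descr(int in_mem, int reg_no)
{
    unsigned d = 0;

    if (in_mem)
        d |= IN_MEMORY;
    if (reg_no != NO_REG)
        d |= (unsigned)(reg_no + 1) & REG_FIELD;
    return (descr_t)d;
}

static int descr_in_memory(descr_t d)   { return (d & IN_MEMORY) != 0; }
static int descr_in_register(descr_t d) { return (d & REG_FIELD) != 0; }

/* Register number, or NO_REG if the value is in no register. */
static int descr_reg(descr_t d)         { return (int)(d & REG_FIELD) - 1; }
```

Storing the register number plus one lets register 0 be distinguished from "no register" without spending another bit.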

Exactly the same problem exists for constants. Even
though constant values are fixed and they reside in a
constant table all the time, the compiler should not transfer
a constant value into a register if it is already in one.
The following routine sets a constant address descriptor.

cnst_adr_dscr(cnst_no, status, reg_no)
int  cnst_no;           /* constant number */
char status,            /* status */
     reg_no;            /* register number */
{
    Sets current addresses of constants. All constants have
    an 8-bit value address descriptor. If the 7th bit of this value
    is 1, and all others are zero, that means the constant is
    not in any register. Otherwise the value of this descriptor
    gives the number of the register in which the constant resides.
}

2. Temporary Management

There is one more address problem. When
calculating an arithmetic expression, we may have a couple
of temporary results. For instance:

    joe = jimmy * 5 + joe * 3;

The statement has the following parse tree:

                     assignment
                    /          \
             variable         addition
               joe           /        \
                multiplication      multiplication
                 /        \           /        \
           variable    constant  variable    constant
            jimmy         5        joe          3

Here, the compiler calculates "jimmy * 5" and "joe * 3"
first. Since it has to keep these results somewhere
temporarily, we have to manage these temporaries and
keep track of their addresses.

The Tiny-C compiler manages the addresses of temporaries in
exactly the same way as it does for variables. In addition, it may
dispose of a temporary, so we can use the same temporary number
somewhere else later.

dispose_temp(temp_no)   /* dispose temporary */
char temp_no;
{
    Disposes the given temporary variable.
}

When the code generator finishes a statement completely,
there is no need for any temporary, because of Tiny-C's sequential
code generation order. So at the end of every statement,
the compiler disposes of all temporary variables.

clean_temps()           /* clean all temporaries */
{
    Disposes all the temporaries.
}

The compiler needs a new temporary every time it calculates a
temporary result. So the following routine provides new
temporaries to the code generator.

get_a_temp(temp_no)     /* get a temporary variable */
char *temp_no;
{
    Finds an unused temporary, returns its number to the code
    generator, and marks it "used."
}
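The three temporary routines can be sketched as a small pool of "used" flags. The pool size `MXTEMP`, the `temp_used` array and the return value of `get_a_temp` are assumptions; the thesis does not show the bodies of these routines.

```c
#define MXTEMP 8            /* hypothetical pool size */

static char temp_used[MXTEMP];      /* 1 if the temporary is in use */

/* Stores the number of a free temporary in *temp_no and marks it used.
   Returns 1 on success, 0 if every temporary is taken. */
static int get_a_temp(int *temp_no)
{
    int i;

    for (i = 0; i < MXTEMP; ++i)
        if (!temp_used[i]) {
            temp_used[i] = 1;
            *temp_no = i;
            return 1;
        }
    return 0;
}

/* Frees one temporary so its number can be handed out again. */
static void dispose_temp(int temp_no)
{
    if (temp_no >= 0 && temp_no < MXTEMP)
        temp_used[temp_no] = 0;
}

/* Frees every temporary; called at the end of each statement. */
static void clean_temps(void)
{
    int i;

    for (i = 0; i < MXTEMP; ++i)
        temp_used[i] = 0;
}
```

Because the lowest free number is always handed out first, a disposed temporary is reused before any fresh one.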


3. Finding Current Addresses

The compiler should be able to figure out any given
token's address at any time. Tiny-C uses the following
routines for this purpose.

is_inreg(token_no, kind)    /* is token in a register? */
int  token_no;          /* token number */
char kind;              /* token kind */
{
    Takes a token kind (variable, constant or temporary
    variable) and its token number, returns TRUE if it is stored
    in a register, else returns FALSE.
}

If a particular token is in a register, it can be
figured out which register this is.

get_reg_num(token_no, reg, kind)    /* get register number */
int  token_no;          /* token number */
char *reg,              /* register number */
     kind;              /* token kind */
{
    Takes a token number and its kind, and returns its
    register number through the "reg" pointer.
}

After some operations, variable values may be only
in registers, and may not be in their allocated memory
locations. The compiler should figure out whether a given variable
is in memory, to avoid transferring it into its memory
location unnecessarily.

isinmem(sym_no)         /* is symbol in memory? */
char sym_no;            /* variable's symbol number */
{
    Checks the variable's address descriptor, returns TRUE if it
    is in memory, else returns FALSE.
}
4. Register Management

A register can hold just one single value. But in
the Tiny-C compiler, this value can belong to more than
one token at the same time; for instance, the same register
can keep two variables, one constant and two temporary
variables in it if they all have the same value at that
particular time.

We will define the structure of the register manager
like this:

#define MXREG 16        /* # of target machine's registers */
#define MXVAR 5         /* maximum # of variables that
                           one single register can hold */
int *regtr[MXREG],              /* pointers to register variables */
    reg_arr[MXREG*MXVAR];       /* register variable array */

So every register has MXVAR register
array (reg_arr) locations. These are token descriptors, and
show which tokens (variables, constants and temporaries)
that particular register holds at any time. The size of the
register array is MXREG times MXVAR.

The register array keeps the names of the tokens which
are loaded in registers.

Since particular parts of the register array belong to
particular registers, we can easily figure out which tokens
are in which registers, or which register has which tokens.

In order to calculate a new result, the compiler has to
find an unused register to load the value. The following
routine provides free registers to the code generator.

get_a_reg(reg)          /* get a register */
char *reg;              /* register number */
{
    Checks every register's register array locations. If it finds
    a blank one, returns this register to the code generator. If they
    are all occupied, evacuates one of them randomly, and returns
    it.
}
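The register array can be sketched as below. Several simplifications are assumed: a token is a single non-negative integer rather than a (number, kind) pair, evacuation only clears the slots instead of generating store code, and the victim is chosen round-robin as a deterministic stand-in for the random choice described above. `EMPTY`, `init_regs`, `find_reg` and `next_victim` are hypothetical names.

```c
#define MXREG 16            /* # of target machine's registers */
#define MXVAR 5             /* max # of tokens one register can hold */
#define EMPTY (-1)          /* marks an unused slot; tokens are >= 0 */

static int reg_arr[MXREG * MXVAR];
static int next_victim = 0; /* stand-in for the random choice */

static void init_regs(void)
{
    int i;

    for (i = 0; i < MXREG * MXVAR; ++i)
        reg_arr[i] = EMPTY;
    next_victim = 0;
}

static int reg_is_blank(int reg)
{
    int k;

    for (k = 0; k < MXVAR; ++k)
        if (reg_arr[reg * MXVAR + k] != EMPTY)
            return 0;
    return 1;
}

/* Records that "reg" now also holds "token"; 0 if its slots are full. */
static int occupy_reg(int token, int reg)
{
    int k;

    for (k = 0; k < MXVAR; ++k)
        if (reg_arr[reg * MXVAR + k] == EMPTY) {
            reg_arr[reg * MXVAR + k] = token;
            return 1;
        }
    return 0;
}

/* Returns the register that holds "token", or -1 if none does. */
static int find_reg(int token)
{
    int i;

    for (i = 0; i < MXREG * MXVAR; ++i)
        if (reg_arr[i] == token)
            return i / MXVAR;
    return -1;
}

/* Returns a blank register, evacuating one if none is blank. */
static int get_a_reg(void)
{
    int reg, k;

    for (reg = 0; reg < MXREG; ++reg)
        if (reg_is_blank(reg))
            return reg;
    reg = next_victim;
    next_victim = (next_victim + 1) % MXREG;
    for (k = 0; k < MXVAR; ++k)     /* a real compiler would emit stores here */
        reg_arr[reg * MXVAR + k] = EMPTY;
    return reg;
}
```

Each register owns the slice `reg_arr[reg*MXVAR .. reg*MXVAR+MXVAR-1]`, so the owner of any slot is simply its index divided by MXVAR.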


The compiler should be able to load a token from its
memory location into one of the registers. The following
routine is used for this purpose.

load_in_reg(token_no, reg, kind)    /* load into a register */
int  token_no;          /* token number */
char *reg,              /* register number */
     kind;              /* token kind */
{
    Takes a register, a token number and its kind, and
    generates code to load it into the given register.
}

After loading this token into a register, its address
descriptor has to be set as "in both register and memory",
and the register manager should set the members of this
particular register.

Suppose we load an integer value 3 into a
register. The register manager should know that the register
is keeping a constant value "3", or rather which constant number
from our constant table is in that register.

Then, suppose we assign this constant to a variable,
as in the statement "joe=3."

The register manager should mark that this particular
register has a constant and a variable in it.

The following routine helps the register manager to state
that a register is now holding a given token.

occupy_reg(token_no, reg, kind)     /* occupy register */
int  token_no;          /* token number */
char *reg,              /* register number */
     kind;              /* token kind */
{
    Enters the given token into the given register's register array
    location, to mark that this register is holding that
    token in it.
}

There may be times when the compiler assigns a new
value to a variable, but that particular variable may
have been stored in a different register before. Since we
want to bind a new register to the old variable, we want to
release its old register.

rel_sym_reg(sym_no)     /* release symbol's register */
int sym_no;
{
    Takes a variable, finds its register, and deletes its
    membership in this register.
}


Sometimes the compiler has to store a token from its
register into its memory location. The following two routines
do this chore.

eva_symbol(sym_no, reg_no)  /* evacuate symbol from register */
char sym_no, reg_no;
{
    Generates code to transfer the symbol from the register into
    memory. Then sets the symbol's address descriptor as "in
    memory" only.
}

eva_temp(temp_no, reg_no)   /* take temporary out of register */
char temp_no, reg_no;
{
    Generates code to transfer the temporary from the register into
    memory. Then sets its address descriptor as "in memory" only.
}

And there are some cases when the compiler wants to empty a
register completely. For instance, we may do this to release
a register.

eva_reg(reg_no)         /* evacuate register */
char reg_no;
{
    Takes a register number, finds all its members in the
    register array, and generates code to transfer those members
    to their memory locations if they are not already there.
    (Uses the above two routines, actually.)
}

Before getting out of a basic block, the compiler should
empty all registers. The following routine does this task.

clear_regs()            /* clean registers */
{
    Calls the "evacuate register" routine for all registers.
}


5. Operands for the Operators

In the actual code generation phase, the compiler
looks for an operator and, according to the operator's type,
requests the registers for its operands. The following two
routines return integer operands in registers.

load_two_oprnd(i, r1, r2, step)     /* load two integer operands */
int  i;                 /* pointer to int. code */
char *r1, *r2,          /* registers */
     *step;             /* # of the total steps taken */
{
    Gets two operands from the intermediate code, loads them
    into two available registers, and returns these register
    numbers to the code generator. Since our parse tree is in
    a flattened form, the code generator needs to know where it
    has come to in that array-tree after loading these operands. So
    "step" is a variable that tells how many steps have
    been consumed in the intermediate code.
}

load_one_oprnd(i, reg, step)        /* load one integer operand */
int  i;                 /* pointer to int. code */
char *reg,              /* register number */
     *step;             /* # of the total steps taken */
{
    Loads the next operand in the parse tree into a
    register, and returns the register number with the number of
    steps walked in the parse tree.
}

The Abstract Machine AM has some boolean operators
that accept only boolean operands. But everything in Tiny-C
has integer type. So the code generator should have some
tools to convert integer values into booleans. The
following two routines provide boolean operands for boolean
operators whenever they are needed.

two_bool(i, r1, r2, step)           /* load two boolean operands */
int  i;                 /* pointer to int. code */
char *r1, *r2,          /* registers */
     *step;             /* # of the total steps taken */
{
    Loads two operands. If they have integer values, it loads
    the corresponding boolean values into registers and returns
    them to the code generator.
}

one_bool(i, reg_no, step)           /* load one boolean operand */
int  i;                 /* pointer to int. code */
char *reg_no,           /* register number */
     *step;             /* # of the total steps taken */
{
    Returns one boolean operand in a register.
}
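The conversion matters because a bitwise operation on raw integers does not give the logical result. The sketch below assumes the usual C convention that any nonzero integer is true; how AM itself represents booleans is not shown here, and `to_bool`, `bool_and` and `bool_or` are hypothetical helpers.

```c
/* Any nonzero integer counts as true, as in C. */
static int to_bool(int value)
{
    return value != 0;
}

/* Logical AND on integer operands: normalize each operand first. */
static int bool_and(int a, int b)
{
    return to_bool(a) & to_bool(b);
}

/* Logical OR on integer operands: normalize each operand first. */
static int bool_or(int a, int b)
{
    return to_bool(a) | to_bool(b);
}
```

For example, 2 and 4 are both true, yet `2 & 4` is 0; after normalization `bool_and(2, 4)` yields 1.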


J. CODE GENERATION

In the Tiny-C compiler, the main routine in the code
generator is a large switch statement, as is used in most
compilers. The compiler generates code for the data segment
first, which is just a memory allocation routine for the
symbols. Then the code segment comes as the actual code
generation phase. The following routine is a subset of the
code generation routine for the code segment. Each case
element dispatches to the code emitter for that case.

code_seg()      /* give code segment */
{
    int  i;                 /* index variable */
    char r1, r2, step;      /* register numbers, steps taken */

    /* walk emit array from beginning to emit-end */
    for (i=0; i<emitend; ++i)
        /* if node has children (if it is not a leaf) */
        if (emitchl[i]!=0)
            switch (emitstr[i])
            {
                case IADD :     /* integer addition */
                    code_iadd(i,&step);
                    break;
                case MEND :     /* end of main function */
                    fprintf(fl, ...);
                    break;
                /* ... */
            }
}

The following routine is used by the above "code_seg()"
routine and emits code for integer additions.


code_iadd(i, step)      /* integer addition */
int  i;
char *step;             /* # of the steps taken on int. code */
{
    char r1, r2,        /* register numbers */
         temp_no;       /* temporary variable number */

    /* load two operands */
    load_two_oprnd(i-1, &r1, &r2, step);

    /* they both might be in the same register,
       if so, allocate one more register */
    if (r1==r2)
    {
        get_a_reg(&r1);
        fprintf(fl, "   mv r(%d), r(%d)\n", r1, r2);
    }

    /* Since addition will be loaded in r1, evacuate it first */
    eva_reg(r1);

    /* code for integer addition */
    fprintf(fl, "   add r(%d), r(%d)\n", r1, r2);

    /* give a number to this temporary result */
    get_a_temp(&temp_no);
    occupy_reg(temp_no, r1, TEMP);

    /* set temporary's address descriptor */
    temp_var[temp_no]=r1+1;

    /* validate emission array for sequential code generation */
    emitstr[i]=TEMP;
    emitchl[i]=*step+1;
    emitstr[i-1]=temp_no;
}

Some sample C programs and the code generated for them by
the Tiny-C compiler can be found in Appendix F.


V. CONCLUSION

Precise, understandable and enforceable interface
standards can provide a way to improve efforts toward
portable software. In the Tiny-C implementation we showed a
way to improve the programming capabilities of AM, and
encouraged programmers to use such a portable and standardizable
machine in high level languages.

Unfortunately, this implementation is not completely
satisfactory. Because of restricted capabilities in the
target AM machine, the Tiny-C compiler does not fully
support application programming. Some of these restrictions
are:

- Based on the principle of resource abstraction, AM has
  strictly defined data types. Since it presently does not
  support conversion between two types, it is a higher
  level concept than the "C" language. So, contrary to
  usual implementations, this thesis had an opposite
  direction: production of a lower level tool in a higher
  level environment.

- The AM abstract machine does not yet have a complete
  linker. So the user is forced to keep the whole
  program and input/output library in one single module,
  which is extremely inconvenient in application
  environments.

- The current version of AM is an emulator, rather than
  hardware. Even though this is convenient for a
  development phase, it is not going to be an easy-to-use
  product for users.

Any further development that could be done for an
improved AM environment might include:

- A linker for AM

- Type conversion between AM data types

APPENDIX A

GRAMMAR FOR TINY-C LANGUAGE





PROGRAM :

program:
    <pre-processor>* <data-definition>*
    <function-definition>+

PRE-PROCESSOR :

pre-processor:
    "#define" <file-definition> |
    "#include" <file-definition>

file-definition:
    "<" <filename> ">" |
    '"' <filename> '"'

filename:
    <identifier> <filetype>?

filetype:
    "." <identifier>

DATA DEFINITIONS :

data-definition:
    <sc-specifier>? <declaration>

sc-specifier:
    "auto" |
    "static" |
    "extern" |
    "register"

declaration:
    <type-specifier> <variable-declaration-list> ";"

type-specifier:
    "char" |
    "short" |
    "int" |
    "long" |
    "unsigned" |
    "float" |
    "double"

variable-declaration-list:
    <variable-declaration> <more-variable-declarations>*

more-variable-declarations:
    "," <variable-declaration>

DECLARATIONS :

variable-declaration:
    "*"? <identifier> <index-declaration>? <initializer>?

index-declaration:
    "[" <constant-expression>? (1) "]"

initializer:
    "=" <primary>

primary:
    <identifier> |
    <constant> |
    <char-definition> |
    <string>

char-definition:
    "'" <character> "'"

string:
    '"' <character>* '"'


FUNCTION DEFINITION :

function-definition:
    <type-specifier>? <function-declaration> <function-body>?

function-declaration:
    <identifier> "(" <identifier-list>? ")"

identifier-list:
    <identifier> <more-identifiers>*

more-identifiers:
    "," <identifier>

function-body:
    <type-declaration-list> <compound-statement>

type-declaration-list:
    <parameter-declaration>*

parameter-declaration:
    <type-specifier> <parameter-declaration-list>

parameter-declaration-list:
    <parameter> <more-parameters>*

more-parameters:
    "," <parameter>

parameter:
    "*"? <identifier> <index-declaration>?


STATEMENTS :

statement:
    <compound-statement> |
    <function-call> ";" |
    <assignment-statement> ";" |
    <if-statement> |
    <while-statement> |
    <do-statement> |
    <for-statement> |
    <switch-statement> |
    <break-statement> |
    "continue" ";" |
    <return-statement> |
    <goto-statement> |
    <label>


compound-statement:
    "{" <declaration>* <statement>+ "}"

function-call:
    <identifier> "(" <expression-list>? ")"

expression-list:
    <expression> <more-expressions>*

more-expressions:
    "," <expression>

assignment-statement:
    <assignment> |
    <incremental-expression>

assignment:
    <lvalue> "=" <logic-expression> |
    <lvalue> <shift-assignment-op> <shift-expression> |
    <lvalue> <bitwise-assignment-op> <bitwise-expression>

shift-assignment-op:
    "+=" | "-=" | "*=" | "/=" | "%=" | "<<=" | ">>="

bitwise-assignment-op:
    "&=" | "^=" | "|="

incremental-expression:
    "++" <lvalue> |
    "--" <lvalue> |
    <lvalue> "++" |
    <lvalue> "--"


if-statement:
    "if" "(" <logic-expression> ")" <statement>
    <else-statement>?

else-statement:
    "else" <statement>

while-statement:
    "while" "(" <logic-expression> ")" <statement>

do-statement:
    "do" <statement> "while" "(" <logic-expression> ")" ";"

for-statement:
    "for" "(" <assignment-list>? ";" <logic-expression>
    ";" <assignment-list>? ")" <statement>

assignment-list:
    <assignment-statement> <more-assignments>*

more-assignments:
    "," <assignment-statement>

switch-statement:
    "switch" "(" <arithmetic-expression> ")" "{"
    <case-stmt>+ "}"

case-stmt:
    ["case" | "default"] <constant-expression> ":"
    <statement>*

break-statement:
    "break" ";"

return-statement:
    "return" <expression> ";"

goto-statement:
    "goto" <identifier> ";"

label:
    <identifier> ":"


EXPRESSIONS :

expression:
    <string> |
    <pointer-expression> |
    <address-expression> |
    <logic-expression> |
    <incremental-expression>

pointer-expression:
    "*" <array-element> |
    "*" <identifier> |
    "*" "(" <arith-expr> ")"

address-expression:
    "&" <array-element> |
    "&" <identifier>

logic-expression:
    <logic-term> <more-logic-terms>*

more-logic-terms:
    "||" <logic-term>

logic-term:
    <logic-factor> <more-logic-factors>*

more-logic-factors:
    "&&" <logic-factor>

logic-factor:
    "!"? <bitwise-expression> |
    "!"? "(" <logic-expression> ")"


bitwise-expression:
    "~"? <bitwise-term> <more-bitwise-terms>*

more-bitwise-terms:
    "|" <bitwise-term>

bitwise-term:
    <bitwise-factor> <more-bitwise-factors>*

more-bitwise-factors:
    "^" <bitwise-factor>

bitwise-factor:
    <bitwise-element> <more-bitwise-elements>*

more-bitwise-elements:
    "&" <bitwise-element>

bitwise-element:
    <compare-expression> |
    "(" <bitwise-expression> ")"

compare-expression:
    <compare-term> <more-compare-terms>*

more-compare-terms:
    <equality-op> <compare-term>

equality-op:
    "==" | "!="

compare-term:
    <compare-factor> <more-compare-factors>*

more-compare-factors:
    <relation-op> <compare-factor>

relation-op:
    "<" | ">" | "<=" | ">="

compare-factor:
    <shift-expression> |
    "(" <compare-expression> ")"

shift-expression:
    <lvalue> <shift-op> <arith-expression> |
    <arith-expression>

shift-op:
    "<<" | ">>"


arith-expression: 
["-"] <term> <more-terms>* 


more-terms: 
<add-op> <term> 


add-op: 
"+" | "-" 


term: 
<factor> <more-factors>* 


more-factors: 
<mult-op> <factor> 


mult-op: 
"*" | "/" | "%" 


factor: 
"(" <arith-expr> ")" | 
<constant-expression> | 
<character-definition> | 
<function-call> | 
<incremental-expression> | 
<lvalue> 


constant-expression: 
<constant> | 
<constant-identifier> 


lvalue: 
<array-element> | 
<identifier> | 
<pointer-expression> 


array-element: 
<identifier> <index> 


index: 
"[" <arith-expression> "]" 

SEMANTIC CONSTRAINTS : 


(1) Prohibited for extern and parameter declarations. Mandatory 
for others. 
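
The arith-expression, term and factor rules map directly onto mutually recursive C functions, one per nonterminal, with each starred "more-..." rule becoming a while loop. The following is only a minimal sketch of that mapping, assuming single-digit operands and no white space; the names src, pos and accepts are hypothetical and do not come from the thesis parser, which works on its own buf/bufp scanner buffer.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical scanner state for this sketch only. */
static const char *src;
static size_t pos;

static int arith_expr(void);

/* Consume c if it is the next character. */
static int match(char c)
{
    if (src[pos] == c) { pos++; return 1; }
    return 0;
}

/* factor: "(" <arith-expr> ")" | digit   (reduced from the full rule) */
static int factor(void)
{
    size_t save = pos;
    if (match('(')) {
        if (arith_expr() && match(')')) return 1;
        pos = save;                 /* backtrack */
        return 0;
    }
    if (isdigit((unsigned char)src[pos])) { pos++; return 1; }
    return 0;
}

/* more-factors: <mult-op> <factor> */
static int more_factors(void)
{
    size_t save = pos;
    if ((match('*') || match('/') || match('%')) && factor()) return 1;
    pos = save;
    return 0;
}

/* term: <factor> <more-factors>* */
static int term(void)
{
    if (!factor()) return 0;
    while (more_factors())
        ;
    return 1;
}

/* more-terms: <add-op> <term> */
static int more_terms(void)
{
    size_t save = pos;
    if ((match('+') || match('-')) && term()) return 1;
    pos = save;
    return 0;
}

/* arith-expression: ["-"] <term> <more-terms>* */
static int arith_expr(void)
{
    size_t save = pos;
    match('-');                     /* optional unary minus */
    if (!term()) { pos = save; return 0; }
    while (more_terms())
        ;
    return 1;
}

/* Returns 1 when the whole string is one arith-expression. */
int accepts(const char *text)
{
    src = text;
    pos = 0;
    return arith_expr() && src[pos] == '\0';
}
```

Under these assumptions, accepts("-(1+2)*3") succeeds, while accepts("1+") fails because the trailing operator is left unconsumed.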


APPENDIX B 


TINY-C PARSER VERSION 1 


extern char pr e eyexvem, “Tunc end; 
extern int bufp, glbptr, line_no; 
program ()                    /* Tiny-C Program */ 
{ 
      while (prepros()) 
            ; 
      while (data_def()) 
            ; 
      if (!func_def()) goto quit; 
      while (!match(EOF)) 
      {     func_end=FALSE; 
            if (!func_def()) 
                  goto quit; 
      } 
      return (TRUE); 
quit: return (FALSE); 
} 
prepros()                     /* pre-processor */ 
{     int oldp=bufp, linep=line_no; 

      glbptr=bufp; 
      if (matchtoken("#define ")) 
      { 
            if (!cnst_id()) goto quit; 
            if (!constant()) goto quit; 
      } 
      else if (matchtoken("#include ")) 
      { 
            if (!file_def()) goto quit; 
      } 
      else goto quit; 
      return (TRUE); 

quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 


file_def()                    /* file definition */ 
{ 
      int oldp=bufp, linep=line_no; 
      char limiter; 

      if (match('"')) limiter='"'; 
      else if (match('<')) limiter='<'; 
      else goto quit; 
      if (!filename()) goto quit; 
      if (limiter=='"') 
      { 
            if (!match('"')) goto quit; 
      } 
      else if (!match('>')) goto quit; 
      return (TRUE); 
quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 
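
Every routine in this parser follows the same discipline: remember the buffer position on entry, and restore it at the quit label if any step fails. A minimal, self-contained sketch of that pattern for the file definition is given below. It assumes a plain string as input and returns the number of characters consumed; the signature and the helper bodies are hypothetical simplifications, not the thesis scanner.

```c
#include <ctype.h>

/* Hypothetical scanner state, named after the listing's buf and bufp. */
static const char *buf;
static int bufp;

static int match(char c)
{
    if (buf[bufp] == c) { bufp++; return 1; }
    return 0;
}

/* Identifier: a letter followed by letters, digits or '_'. */
static int id(void)
{
    if (!isalpha((unsigned char)buf[bufp])) return 0;
    while (isalnum((unsigned char)buf[bufp]) || buf[bufp] == '_')
        bufp++;
    return 1;
}

/* filename: <id> [ "." <id> ]   (the file type is optional) */
static int filename(void)
{
    int oldp;
    if (!id()) return 0;
    oldp = bufp;
    if (!(match('.') && id()))
        bufp = oldp;                /* undo a half-read file type */
    return 1;
}

/* file-def: '"' <filename> '"'  |  '<' <filename> '>'
   Returns the number of characters consumed, or 0 on failure. */
int file_def(const char *text)
{
    int oldp;
    char closer;

    buf = text;
    bufp = 0;
    oldp = bufp;                    /* remember where we started */

    if (match('"'))      closer = '"';
    else if (match('<')) closer = '>';
    else goto quit;

    if (!filename())     goto quit;
    if (!match(closer))  goto quit;
    return bufp;

quit:
    bufp = oldp;                    /* backtrack: nothing is consumed */
    return 0;
}
```

Because the position is restored on failure, a caller can try another alternative on the same input without knowing how far the failed attempt had read.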
filename ()                   /* file name */ 
{ 
      if (!id()) return (FALSE); 
      if (filetype()) 
            ; 
      return (TRUE); 
} 
filetype ()                   /* file type */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match('.')) goto quit; 
      if (!id()) goto quit; 
      return (TRUE); 

quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 
data_def()                    /* data definition */ 
{ 
      int oldp=bufp, linep=line_no; 

      glbptr=bufp; 
      if (sc_spcfr()) 
            ; 
      if (!dclrtion()) goto quit; 
      return (TRUE); 
quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 
sc_spcfr()                    /* sc specifier */ 
{ 
      if (matchtoken("auto ")) 
            ; 
      else if (matchtoken("static ")) 
            ; 
      else if (matchtoken("extern ")) 
            ; 
      else if (matchtoken("register ")) 
            ; 
      else return (FALSE); 
      return (TRUE); 
} 
dclrtion ()                   /* declaration */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!typ_spf()) goto quit; 
      if (!var_dec_list()) goto quit; 
      if (!match(';')) goto quit; 
      return (TRUE); 
quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 
typ_spf ()                    /* type specifier */ 
{ 
      if (matchtoken("char ")) 
            ; 
      else if (matchtoken("short ")) 
            ; 
      else if (matchtoken("int ")) 
            ; 
      else if (matchtoken("long ")) 
            ; 
      else if (matchtoken("unsigned ")) 
            ; 
      else if (matchtoken("float ")) 
            ; 
      else if (matchtoken("double ")) 
            ; 
      else return (FALSE); 
      return (TRUE); 
} 
var_dec_list ()               /* variable declaration list */ 
{ 
      if (!vardclr()) return (FALSE); 
      while (morevardcls()) 
            ; 
      return (TRUE); 
} 
morevardcls()                 /* more variable declarations */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match(',')) goto quit; 
      if (!vardclr()) goto quit; 
      return (TRUE); 

quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 


vardclr ()                    /* variable declaration */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (match('*')) 
            ; 
      if (!id()) goto quit; 
      if (indxdclr()) 
            ; 
      if (initializer()) 
            ; 
      return (TRUE); 

quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 
indxdclr()                    /* index declaration */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match('[')) goto quit; 
      if (cnst_expr()) 
            ; 
      if (!match(']')) goto quit; 
      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 

initializer()                 /* initializer */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match('=')) goto quit; 
      if (match('{')) 
      { 
            if (!expression()) goto quit; 
            while (moreexpr()) 
                  ; 
            if (!match('}')) goto quit; 
      } 
      else if (!expression()) goto quit; 
      return (TRUE); 

quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 


func_def ()                   /* function definition */ 
{ 
      int oldp=bufp, linep=line_no; 
      glbptr=bufp; 
      if (typ_spf()) 
            ; 
      if (!func_dclr()) goto quit; 
      glbptr=bufp; 
      if (!func_body()) goto quit; 
      func_end=TRUE; 
      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 


func_dclr()                   /* function declaration */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!id()) goto quit; 
      if (!match('(')) goto quit; 
      if (idnfrs()) 
            ; 
      if (!match(')')) goto quit; 
      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
idnfrs()                      /* identifiers */ 
{ 
      if (!id()) return (FALSE); 
      while (more_id()) 
            ; 
      return (TRUE); 
} 
more_id()                     /* more identifiers */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (!match(',')) goto quit; 
      if (!id()) goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
func_body ()                  /* function body */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (!type_dec_lst()) goto quit; 
      glbptr=bufp; 
      if (!cmpr_stmt()) goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 


type_dec_lst()                /* type declaration list */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (par_dclrtion()) 
            ; 
      while (par_dclrtion()) 
            ; 
      return (TRUE); 
quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 

par_dclrtion()                /* parameter declarations */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!typ_spf()) goto quit; 
      if (!par_dec_list()) goto quit; 
      if (!match(';')) goto quit; 
      return (TRUE); 
quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 
par_dec_list()                /* parameter declaration list */ 
{ 
      if (!parameter()) return (FALSE); 
      while (morepardcls()) 
            ; 
      return (TRUE); 
} 
morepardcls()                 /* more parameter declarations */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match(',')) goto quit; 
      if (!parameter()) goto quit; 
      return (TRUE); 
quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 

parameter ()                  /* parameter */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (match('*')) 
            ; 
      if (!id()) goto quit; 
      if (indxdclr()) 
            ; 
      return (TRUE); 
quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 



stmt ()                       /* statement */ 
{ 
      if (cmpr_stmt()) 
            ; 
      else if (if_stmt()) 
            ; 
      else if (while_stmt()) 
            ; 
      else if (do_stmt()) 
            ; 
      else if (for_stmt()) 
            ; 
      else if (swtc_stmt()) 
            ; 
      else if (break_stmt()) 
            ; 
      else if (matchtoken("continue ",1)) 
      { 
            if (!match(';')) goto quit; 
      } 
      else if (rtrn_stmt()) 
            ; 
      else if (goto_stmt()) 
            ; 
      else if (func_call()) 
      { 
            if (!match(';')) goto quit; 
      } 
      else if (asnmt()) 
      { 
            if (!match(';')) goto quit; 
      } 
      else if (label()) 
            ; 
      else if (match(';')) 
            ; 
      else goto quit; 

      return (TRUE); 
quit: return (FALSE); 
} 


cmpr_stmt()                   /* compound statement */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match('{')) goto quit; 

      while (dclrtion()) 

ie Guy) müz nunun: 

      while (stmt()) 
            ; 
      if (!match('}')) goto quit; 
      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
func_call()                   /* function call */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (!id()) goto quit; 
      if (!match('(')) goto quit; 
      if (expr_lst()) 
            ; 
      if (!match(')')) goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 


expr_lst()                    /* expression list */ 
{ 
      if (!expression()) return (FALSE); 
      while (moreexpr()) 
            ; 
      return (TRUE); 
} 
moreexpr ()                   /* more expressions */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match(',')) goto quit; 
      if (!expression()) goto quit; 
      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
asnmt ()                      /* assignment statement */ 
{ 
      if (assign()) 
            ; 
      else if (incr_stmt()) 
            ; 
      else return (FALSE); 
      return (TRUE); 
} 
assign()                      /* simple assignment */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!lvalue()) goto quit; 
      if (match('=')) 
      {     if (!lgc_expr()) goto quit; 
      } 
      else if (shf_asm_op()) 
      {     if (!shf_expr()) goto quit; 
      } 
      else if (btw_asm_op()) 
      {     if (!btw_expr()) goto quit; 
      } 
      else goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 

shf_asm_op()                  /* shift assignment operator */ 
{ 
      if (matchtoken("+= ",0)) 
            ; 
      else if (matchtoken("-= ",0)) 
            ; 
      else if (matchtoken("*= ",0)) 
            ; 
      else if (matchtoken("/= ",0)) 
            ; 
      else if (matchtoken("%= ",0)) 
            ; 
      else if (matchtoken(">>= ",0)) 
            ; 
      else if (matchtoken("<<= ",0)) 
            ; 
      else return (FALSE); 
      return (TRUE); 
} 

btw_asm_op()                  /* bitwise assignment operator */ 
{ 
      if (matchtoken("&= ",0)) 
            ; 
      else if (matchtoken("^= ",0)) 
            ; 
      else if (matchtoken("|= ",0)) 
            ; 
      else return (FALSE); 
      return (TRUE); 
} 

incr_stmt ()                  /* incremental statement */ 
{ 
      int oldp=bufp, linep=line_no; 
      char pre_op=TRUE;       /* pre-operator */ 

      if (matchtoken("++ ",0)) 
            ; 
      else if (matchtoken("-- ",0)) 
            ; 
      else pre_op=FALSE; 
      if (!lvalue()) goto quit; 

      /* without a pre-operator, a post-operator is required */ 
      if (!pre_op) 
      { 
            if (matchtoken("++ ",0)) 
                  ; 
            else if (matchtoken("-- ",0)) 
                  ; 
            else goto quit; 
      } 

      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 


if_stmt ()                    /* if statement */ 
{ 
      if (!matchtoken("if ",1)) goto quit; 
      if (!match('(')) goto quit; 
      if (!lgc_expr()) goto quit; 
      if (!match(')')) goto quit; 
      if (!stmt()) goto quit; 

      if (else_stmt()) 
            ; 

      return (TRUE); 
quit: return (FALSE); 
} 



else_stmt ()                  /* else statement */ 
{ 
      if (!matchtoken("else ",1)) goto quit; 
      if (!stmt()) goto quit; 
      return (TRUE); 
quit: return (FALSE); 
} 

while_stmt()                  /* while statement */ 
{ 
      if (!matchtoken("while ",1)) goto quit; 
      if (!match('(')) goto quit; 
      if (!lgc_expr()) goto quit; 
      if (!match(')')) goto quit; 
      if (!stmt()) goto quit; 
      return (TRUE); 
quit: return (FALSE); 
} 

do_stmt()                     /* do statement */ 
{ 
      if (!matchtoken("do ",1)) goto quit; 
      if (!stmt()) goto quit; 
      if (!matchtoken("while ",1)) goto quit; 
      if (!match('(')) goto quit; 
      if (!lgc_expr()) goto quit; 
      if (!match(')')) goto quit; 
      if (!match(';')) goto quit; 
      return (TRUE); 
quit: return (FALSE); 
} 



for_stmt ()                   /* for statement */ 
{ 
      if (!matchtoken("for ",1)) goto quit; 
      if (!match('(')) goto quit; 
      if (asn_list()) 
            ; 
      if (!match(';')) goto quit; 
      if (!lgc_expr()) goto quit; 
      if (!match(';')) goto quit; 
      if (asn_list()) 
            ; 
      if (!match(')')) goto quit; 
      if (!stmt()) goto quit; 
      return (TRUE); 
quit: return (FALSE); 
} 
asn_list()                    /* assignment list */ 
{ 
      if (!asnmt()) return (FALSE); 
      while (more_asnmt()) 
            ; 
      return (TRUE); 
} 
more_asnmt ()                 /* more assignments */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (!match(',')) goto quit; 
      if (!asnmt()) goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 



swtc_stmt ()                  /* switch statement */ 
{ 
      if (!matchtoken("switch ",1)) goto quit; 
      if (!match('(')) goto quit; 
      if (!art_expr()) goto quit; 
      if (!match(')')) goto quit; 
      if (!match('{')) goto quit; 
      if (!case_stmt()) goto quit; 

      while (case_stmt()) 
            ; 
      if (!match('}')) goto quit; 

      return (TRUE); 
quit: return (FALSE); 
} 
case_stmt ()                  /* case statement */ 
{ 
      if (matchtoken("case ",1)) 
      {     if (!cnst_expr()) goto quit; 
      } 
      else if (matchtoken("default ",0)) 
            ; 
      else goto quit; 
      if (!match(':')) goto quit; 

      while (stmt()) 
            ; 

      return (TRUE); 
quit: return (FALSE); 
} 
break_stmt()                  /* break statement */ 
{ 
      if (!matchtoken("break ",1)) goto quit; 
      if (!match(';')) goto quit; 

      return (TRUE); 
quit: return (FALSE); 
} 

rtrn_stmt()                   /* return statement */ 
{ 
      if (!matchtoken("return ",1)) goto quit; 
      if (expression()) 
            ; 
      if (!match(';')) goto quit; 

      return (TRUE); 
quit: return (FALSE); 
} 

goto_stmt()                   /* goto statement */ 
{ 
      if (!matchtoken("goto ",1)) goto quit; 
      if (!id()) goto quit; 
      if (!match(';')) goto quit; 

      return (TRUE); 
quit: return (FALSE); 
} 


label ()                      /* label */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!id()) goto quit; 
      if (!match(':')) goto quit; 

      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 


expression()                  /* expression */ 
{ 
      if (string()) 
            ; 
      else if (pntr_expr()) 
            ; 
      else if (addr_expr()) 
            ; 
      else if (lgc_expr()) 
            ; 
      else if (incr_stmt()) 
            ; 
      else return (FALSE); 
      return (TRUE); 
} 

pntr_expr ()                  /* pointer expression */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match('*')) goto quit; 

      if (array_elm()) 
            ; 
      else if (id()) 
            ; 
      else if (match('(')) 
      { 
            if (!art_expr()) goto quit; 
            if (!match(')')) goto quit; 
      } 
      else goto quit; 

      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 


addr_expr()                   /* address expression */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (!match('&')) goto quit; 

      if (array_elm()) 
            ; 
      else if (id()) 
            ; 
      else goto quit; 

      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 

lgc_expr()                    /* logic expression */ 
{ 
      if (!lgc_trm()) return (FALSE); 
      while (lg_trms()) 
            ; 
      return (TRUE); 
} 

lg_trms()                     /* logic terms */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (!matchtoken("|| ",2)) goto quit; 
      if (!lgc_trm()) goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
lgc_trm()                     /* logic term */ 
{ 
      if (!lgc_fct()) return (FALSE); 
      while (lg_fcts()) 
            ; 
      return (TRUE); 
} 


lg_fcts()                     /* logic factors */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (!matchtoken("&& ",2)) goto quit; 
      if (!lgc_fct()) goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
lgc_fct()                     /* logic factor */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (match('!'))         /* unary operator */ 
            ; 
      if (btw_expr()) 
            ; 
      else if (match('(')) 
      { 
            if (!lgc_expr()) goto quit; 
            if (!match(')')) goto quit; 
      } 
      else goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 


btw_expr ()                   /* bitwise expression */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (match('~')) 
            ; 
      if (!btw_trm()) goto quit; 
      while (bt_trms()) 
            ; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 



bt_trms()                     /* bitwise terms */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match('|')) goto quit; 
      if (!btw_trm()) goto quit; 
      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
btw_trm()                     /* bitwise term */ 
{ 
      if (!btw_fct()) return (FALSE); 
      while (bt_fcts()) 
            ; 
      return (TRUE); 
} 
bt_fcts ()                    /* bitwise factors */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match('^')) goto quit; 
      if (!btw_fct()) goto quit; 
      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
btw_fct()                     /* bitwise factor */ 
{ 
      if (!btw_elm()) return (FALSE); 
      while (bt_elms()) 
            ; 
      return (TRUE); 
} 
bt_elms ()                    /* bitwise elements */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!match('&')) goto quit; 
      if (!btw_elm()) goto quit; 
      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 

btw_elm()                     /* bitwise element */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (cmp_expr()) 
            ; 
      else if (match('(')) 
      { 
            if (!btw_expr()) goto quit; 
            if (!match(')')) goto quit; 
      } 
      else goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 


cmp_expr()                    /* compound expression */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (!cmp_trm()) return (FALSE); 
      while (cp_trms()) 
            ; 
      return (TRUE); 
} 



cp_trms()                     /* compound terms */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!equ_op()) goto quit; 
      if (!cmp_trm()) goto quit; 
      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
equ_op()                      /* equality operators */ 
{ 
      if (matchtoken("== ",4)) 
            ; 
      else if (matchtoken("!= ",4)) 
            ; 
      else return (FALSE); 
      return (TRUE); 
} 
cmp_trm()                     /* compound term */ 
{ 
      if (!cmp_fct()) return (FALSE); 
      while (cp_fcts()) 
            ; 
      return (TRUE); 
} 
cp_fcts()                     /* compound factors */ 
{ 
      if (!rel_op()) goto quit; 
      if (!cmp_fct()) goto quit; 

      return (TRUE); 
quit: return (FALSE); 
} 


rel_op()                      /* relational operator */ 
{ 
      if (match('<')) 
      {     if (match('=')) ; 
      } 
      else if (match('>')) 
      {     if (match('=')) ; 
      } 
      else return (FALSE); 
      return (TRUE); 
} 

cmp_fct()                     /* compound factor */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (shf_expr()) 
            ; 
      else if (match('(')) 
      { 
            if (!cmp_expr()) goto quit; 
            if (!match(')')) goto quit; 
      } 
      else goto quit; 

      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 

shf_expr()                    /* shift expression */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (shf_init()) 
            ; 
      if (!art_expr()) goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
shf_init()                    /* shift expression-initial */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!lvalue()) goto quit; 
      if (!shf_op()) goto quit; 
      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
shf_op()                      /* shift operator */ 
{ 
      if (matchtoken(">> ",0)) 
            ; 
      else if (matchtoken("<< ",0)) 
            ; 
      else return (FALSE); 
      return (TRUE); 
} 
art_expr ()                   /* arithmetic expression */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (match('-'))         /* unary operator */ 
            ; 
      if (!term()) goto quit; 

      while (more_term()) 
            ; 

      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
more_term()                   /* more terms */ 
{ 
      if (!add_op()) goto quit; 
      if (!term()) goto quit; 

      return (TRUE); 
quit: return (FALSE); 
} 
add_op()                      /* additional operator */ 
{ 
      if (match('+')) 
            ; 
      else if (match('-')) 
            ; 
      else return (FALSE); 
      return (TRUE); 
} 
term ()                       /* term */ 
{ 
      if (!factor()) return (FALSE); 
      while (more_fcts()) 
            ; 
      return (TRUE); 
} 
more_fcts()                   /* more factors */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (!mul_op()) goto quit; 
      if (!factor()) goto quit; 

      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 

mul_op()                      /* multiplicational operator */ 
{ 
      if (match('*')) 
            ; 
      else if (match('/')) 
            ; 
      else if (match('%')) 
            ; 
      else return (FALSE); 

      return (TRUE); 
} 
factor ()                     /* factor */ 
{ 
      int oldp=bufp, linep=line_no; 

      if (match('(')) 
      { 
            if (!art_expr()) goto quit; 
            if (!match(')')) goto quit; 
      } 
      else if (func_call()) 
            ; 
      else if (cnst_expr()) 
            ; 
      else if (char_def()) 
            ; 
      else if (incr_stmt()) 
            ; 
      else if (lvalue()) 
            ; 
      else goto quit; 

      return (TRUE); 

quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
cnst_expr()                   /* constant expression */ 
{ 
      if (constant()) 
            ; 
      else if (!cnst_id()) 
            return (FALSE); 
      return (TRUE); 
} 
lvalue()                      /* left value */ 
{ 
      int oldp=bufp, linep=line_no; 
      char prnthsis=FALSE; 
      if (match('(')) prnthsis=TRUE; 
      if (array_elm()) 
            ; 
      else if (id()) 
            ; 
      else if (pntr_expr()) 
            ; 
      else goto quit; 
      if (prnthsis) 
            if (!match(')')) goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
array_elm()                   /* array element */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (!id()) goto quit; 
      if (!index()) goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
index ()                      /* index expression for arrays */ 
{ 
      int oldp=bufp, linep=line_no; 
      if (!match('[')) goto quit; 

      if (art_expr()) 
            ; 
      if (!match(']')) goto quit; 
      return (TRUE); 
quit: bufp=oldp; nextch=buf[bufp]; line_no=linep; 
      return (FALSE); 
} 
primary ()                    /* primary expression */ 
{ 
      if (cnst_expr()) 
            ; 
      else if (array_elm()) 
            ; 
      else if (id()) 
            ; 
      else if (char_def()) 
            ; 
      else if (string()) 
            ; 
      else return (FALSE); 
      return (TRUE); 
} 


APPENDIX C 


TERMINALS AND BASIC NONTERMINALS 


isaltr (c)                    /* is a letter? */ 
int c;                        /* character to test */ 
{ 
      return ( (c>='A' && c<='Z') || (c>='a' && c<='z') ); 
} 
iscapch (c)                   /* is a capital letter? */ 
int c;                        /* character to test */ 
{ 
      return ( c>='A' && c<='Z' ); 
} 
isadgt (c)                    /* is a digit? */ 
int c;                        /* character to test */ 
{ 
      return ( c>='0' && c<='9' ); 
} 
isidch(c)                     /* is identifier character? */ 
int c; 
{ 
      return ( isaltr(c) || isadgt(c) || c=='_' ); 
} 
delimiter() /* is next-character a delimiter? */ 
4 
return nextehzz '—' || nextch== ? (? || nextch== "a" 
nextch== '*' || nextchz- ?-—” || nextch== 717 
nextch== ?*ж? || пехесҺһ== * /* || nextch== '&? 
mexteh== 757’ Il nextch== ',"? ıı nextchz-2 ? :? 
nextch== 7)? || nextch== 7”? || nextch== 7!) 
nextch== "ai || nextch== ' © || пехбсҺ== ' )” 
rnextchz- ?(? || nextch== ? 3? || пехебсҺһ==?,? 
whtchr (nextch) )us 
} 
whtchr (c)                    /* is a white-character? */ 
int c;                        /* character to test */ 
{ 
      return ( c==' ' || c==TAB || c==CR || c==LF ); 
} 
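
The character predicates in this appendix are written in the pre-ANSI (K&R) function style. As a hedged illustration only, the same range tests can be written with ANSI prototypes as below. The sketch assumes an ASCII-compatible character set, as the original range comparisons do, and fixes the white-space characters to blank, tab, carriage return and line feed, which is an assumption about the values of the TAB, CR and LF constants.

```c
/* ANSI C sketch of the character predicates (ASCII assumed). */

/* Is c a letter? */
int isaltr(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Is c a capital letter? */
int iscapch(int c)
{
    return c >= 'A' && c <= 'Z';
}

/* Is c a decimal digit? */
int isadgt(int c)
{
    return c >= '0' && c <= '9';
}

/* May c appear inside an identifier? */
int isidch(int c)
{
    return isaltr(c) || isadgt(c) || c == '_';
}

/* Is c a white character?  The four values are assumed here. */
int whtchr(int c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}
```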


char_def ()                   /* is character definition? */ 
{ 
      char blank=FALSE;       /* clear var. for blanks */ 
      delwht();               /* skip white characters */ 
      /* character definition should start with "'" character */ 
      if (!match('\'')) return (FALSE); 

      /* consume the following white characters, if there is any */ 
      while ( whtchr(nextch) ) 
      { 
            nextch = getchr(); 
            blank = TRUE; 
      } 
      if ( nextch=='\'' ) 
      { 
            /* if character body is empty */ 
            if ( !blank ) 
                  /* illegal character definition */ err_msg(ICDF) 
                  ; 
            /* else it is a blank character */ 
            else 
            { 
                  nextch=getchr(); 
                  return (TRUE); 
            } 
      } 

      /* if met '\' character, parse one more */ 
      if ( nextch=='\\' ) nextch=getchr() 
            ; 

      /* parse the original character */ 
      nextch=getchr(); 

      /* should finish with "'" character! */ 
      if ( !match('\'') ) 
            /* illegal character definition */ err_msg(ICDF) 
            ; 

      return (TRUE); 
} 



string()                      /* is string? */ 
{ 
      char i;                 /* index variable */ 

      delwht();               /* skip white characters */ 

      /* string must start with '"' character */ 
      if (!match('"')) return (FALSE); 

      /* since strings not implemented in Tiny-C, just consume it */ 
      for (i=0; ( nextch!='"' ) && ( i<=MXSTR ); ++i ) 
            nextch=getchr(); 

      /* check if it is too long */ 
      if ( i >= MXSTR ) 
            /* string length too long */ err_msg (SLTL) 
            ; 

      /* should finish with '"' character */ 
      match('"'); 

      return (TRUE); 
} 


constant ()                   /* integer constant */ 
{ 
      char i=0;               /* index variable */ 
      delwht();               /* skip white characters */ 
      /* it should start with a digit */ 
      if (!isadgt(nextch)) return (FALSE); 
      while (isadgt(nextch))  /* parse the number */ 
      { 
            /* check if number length is too long */ 
            if (i>=MXNML) 
                  /* number length too long */ err_msg (NLTL) 
                  ; 
            else num_name[i++]=nextch 
                  ; 
            nextch=getchr(); 
      } 

      if (nextch!=' ' && !delimiter() ) 
            /* delimiter was expected */ err_msg (DWEX) 
            ; 

      num_name[i]=' '; 

      /* convert string "num_name" into numeric value */ 
      str_num(); 

      /* add number into constant table */ 
      add_num(); 

      return (TRUE); 
} 
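
The listing calls str_num() to turn the collected digit string into a numeric value but does not show its body. The following is therefore only a hypothetical stand-in, a sketch that assumes the digits are terminated by a blank (as num_name is above) or by a NUL, and that an overlong value should be rejected rather than wrapped. The name str_to_num and the return convention of -1 on failure are assumptions made for this illustration.

```c
#include <limits.h>

/* Hypothetical stand-in for str_num(): converts a digit string that is
   terminated by ' ' or '\0'.  Returns the value, or -1 if the string is
   empty, contains a non-digit, or does not fit in an int. */
int str_to_num(const char *num_name)
{
    int n = 0;
    int i;

    if (num_name[0] == ' ' || num_name[0] == '\0')
        return -1;                          /* no digits at all */

    for (i = 0; num_name[i] != ' ' && num_name[i] != '\0'; i++) {
        int d;
        if (num_name[i] < '0' || num_name[i] > '9')
            return -1;                      /* not a digit */
        d = num_name[i] - '0';
        if (n > (INT_MAX - d) / 10)
            return -1;                      /* n*10 + d would overflow */
        n = n * 10 + d;
    }
    return n;
}
```

The overflow test is done before the multiplication, so the function never relies on signed wrap-around, which is undefined in C.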

cnst_id()                     /* constant identifier */ 
{ 
      int oldp, linep; 
      char i=0;               /* index variable */ 
      delwht();               /* skip white characters */ 
      oldp=bufp; linep=line_no; 
      nextch=buf[bufp]; 

      /* first character should be a capital letter */ 
      if (!iscapch(nextch)) return (FALSE); 

      while (iscapch(nextch)) 
      { 
            /* check if identifier length is too long */ 
            if (i>=MXIDL) 
                  /* identifier length too long */ warning (ILTL) 
                  ; 
            else 
                  id_name[i++]=nextch 
                  ; 
            nextch=getchr(); 
      } 

      /* if following character is still a letter, it can be a lower 
         letter only. since Tiny-C assumes constant identifiers are 
         all capital letters, this cannot be a constant identifier */ 
      if (isaltr(nextch)) goto quit; 

      if (nextch!=' ' && !delimiter() ) 
            /* delimiter was expected */ err_msg (DWEX) 
            ; 

      id_name[i]=' '; 
      return (TRUE); 

      /* backtrack on the scanner buffer, and return FALSE */ 
quit: bufp=oldp; line_no=linep; nextch=buf[bufp]; 
      return (FALSE); 
} 
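
The rule stated in the comment above (a constant identifier is a run of capital letters, and a lower-case letter directly after the run disqualifies it) can be isolated in a few lines. This is a minimal sketch working on a plain string instead of the scanner buffer; the function name cnst_id_len and its return convention are hypothetical, and the delimiter check and length warning of the real routine are left out.

```c
/* Length of a constant identifier at the start of s, or 0 if there is
   none.  Sketch of the rule only: a run of capital letters that is not
   immediately followed by a lower-case letter (ASCII assumed). */
int cnst_id_len(const char *s)
{
    int i = 0;

    while (s[i] >= 'A' && s[i] <= 'Z')
        i++;
    if (i == 0)
        return 0;                   /* must start with a capital letter */
    if (s[i] >= 'a' && s[i] <= 'z')
        return 0;                   /* e.g. "Maxlen" is an ordinary id */
    return i;
}
```

With this rule "MAXLEN 10" yields 6, while "Maxlen" yields 0 and is left for the ordinary identifier routine to parse.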
id ()                         /* is identifier? */ 
{ 
      char i=0;               /* index variable */ 
      delwht();               /* skip white characters */ 

      /* should start with a letter */ 
      if (!isaltr(nextch)) return (FALSE); 

      while (isidch(nextch)) 
      { 
            /* check if identifier length is too long */ 
            if (i>=MXIDL) 
                  /* identifier length too long */ warning (ILTL) 
                  ; 
            else id_name[i++]=nextch 
                  ; 
            nextch=getchr(); 
      } 

      /* following character must be a delimiter! */ 
      if (nextch!=' ' && !delimiter() ) 
            /* delimiter was expected */ err_msg(DWEX) 
            ; 
      id_name[i]=' '; 

      /* if identifier is a reserved word, give error message */ 



MM MV test () > 
            /* reserved word not expected */ err_msg (RVNE) 
            ; 

      return (TRUE); 
} 


APPENDIX D 


TINY-C COMPILER ERROR AND WARNING MESSAGES 


Error Messages: 


#define AVNI  0  /* auto variables not implemented       */ 
#define SVNI  1  /* static variables not implemented     */ 
#define EVNI  2  /* external variables not implemented   */ 
#define RVNI  3  /* register variables not implemented   */ 
#define SMEX  4  /* semicolon was expected               */ 
#define LINI  5  /* long integers not implemented        */ 
#define UINI  6  /* unsigned integers not implemented    */ 
#define FPNI  7  /* floating points not implemented      */ 
#define DPNI  8  /* double precisions not implemented    */ 
#define IDEX  9  /* identifier was expected              */ 
#define IBSB 10  /* index body was supposed to be blank  */ 
#define RSBE 11  /* right square bracket was expected    */ 
#define CINI 12  /* compound initializers not implemented*/ 
#define EXEX 13  /* expression was expected              */ 
#define LCBE 14  /* left curly bracket was expected      */ 
#define LPEX 15  /* left parenthesis was expected        */ 
#define RPEX 16  /* right parenthesis was expected       */ 
#define IEAC 17  /* identifier was expected after comma  */ 
#define EEAC 18  /* expression was expected after comma  */ 
#define PTNI 19  /* pointers not implemented             */ 
#define ARNI 20  /* arrays not implemented               */ 
#define IPNI 21  /* include preprocsr not implemented    */ 
#define FTEX 22  /* filetype was expected                */ 
#define IVFD 23  /* invalid file definition              */ 
#define RCBE 24  /* right curly bracket was expected     */ 
#define PREX 25  /* parameter was expected               */ 
#define PEAC 26  /* parameter expected after comma       */ 
#define AEAC 27  /* an assignment expected after comma   */ 
#define LPEI 28  /* left parenthesis expected after if   */ 
#define LPEW 29  /* left prnthsis. expected after while  */ 
#define LPEF 30  /* left parenthesis expected after for  */ 
#define LPES 31  /* left prnthsis. expected after switch */ 
#define ILEI 32  /* illegal logic expression in if       */ 
#define ILEW 33  /* illegal logic expression in while    */ 
#define ILEF 34  /* illegal logic expression in for      */ 
#define WMFD 35  /* while is missing from do             */ 
#define SSFI 36  /* a statement should follow after if   */ 
#define SSFE 37  /* a statement should follow after else */ 
#define SSFW 38  /* a statement should follow after while*/ 
#define SSFF 39  /* a statement should follow after for  */ 
#define SSFD 40  /* a statement should follow after do   */ 
#define SMIF 41  /* semicolon is missing in for          */ 
#define IAES 42  /* illegal arith. expression in switch  */ 
#define CSMS 43  /* case statement is missing            */ 
#define CLIM 44  /* colon is missing                     */ 


#define ICEC 45  /* invalid constant exprs. after case   */ 
#define AENI 46  /* address expression not implemented   */ 
#define OCNI 47  /* one's complement not implemented     */ 
#define BONI 48  /* bitwise operators not implemented    */ 
#define SENI 49  /* shift expressions not implemented    */ 
#define INPE 50  /* invalid pointer expression           */ 
#define INAE 51  /* invalid address expression           */ 
#define UNVR 52  /* unknown variable                     */ 
#define IAEI 53  /* invalid arith. expr. in array index  */ 
#define RVNE ac /* reserved word not expected gë 
#define ILFB 57  /* illegal function body                */ 
#define CEFR 58  /* input couldn't be parsed             */ 
#define TBBP 59  /* too big block to parse               */ 
#define UEOF 60  /* unexpected end of file               */ 
#define CETL 61  /* comment endless or too long          */ 
#define SETL 62  /* string is endless or too long        */ 
#define UMPH 63  /* unmatched parenthesis                */ 
#define SMUP 64  /* semicolon missing/unmatched prnthesis*/ 
#define CIEX 65  /* constant identifier expected         */ 
#define CVEX 66  /* constant value expected              */ 
#define STIF 67  /* symbol table is full                 */ 
#define NLTL 68  /* numeric length too long              */ 
#define TBNV 69  /* too big numeric value                */ 
#define NSIF 70  /* name string is full                  */ 
#define DTIF 71  /* definition table is full             */ 
#define CTIF 72  /* constant table is full               */ 
#define VSIF 73  /* variable string is full              */ 
#define LTIF 74  /* label table is full                  */ 
#define DCID 75  /* duplicated cons. id declaration      */ 
#define DLDC 76  /* duplicated label declaration         */ 
#define ICDF 77  /* illegal character definition         */ 
#define SLTL 78  /* string length too long               */ 
#define DWEX 79  /* delimiter was expected               */ 
#define UDLB 80  /* undeclared label                     */ 
#define DPDC 81  /* duplicated parameter declaration     */ 
#define DPFA 82  /* declared parameter is not a fun. arg.*/ 
#define UNPE 83  /* undeclared parameter exists          */ 
#define DFDC 84  /* duplicated function declaration      */ 
#define ICAN 86  /* inconsistent argument number         */ 
#define DDDC 87  /* duplicated default declaration       */ 
#define IVBR 88  /* invalid break usage                  */ 
#define TMNL 89  /* too many nested level                */ 


Warning Messages: 


#define AFRI 
#define CSIR 
#define ILTL 
#define TOMF 


/* all functions return integer */ 
/* compound statement is blank  */ 
/* identifier length tao long */ 
/* main function is missing */ 


APPENDIX E


INTERMEDIATE CODE DEFINITIONS FOR TINY-C


#define IADD  1  /* integer addition */
#define ISUB  2  /* integer subtraction */
#define IMUL  3  /* integer multiply */
#define IDIV  4  /* integer division */
#define MDLS  5  /* integer modulus */
#define IMLB  6  /* label declaration */
#define JUMP  7  /* unconditional jump */
#define JPTR  8  /* jump if true */
#define JPFL  9  /* jump if false */
#define LGEX 10  /* logic expression */
#define CONS 11  /* constant */
#define ARID 12  /* array identifier */
#define VARB 13  /* variable */
#define UNMS 14  /* unary minus */
#define LGNT 15  /* unary logic not */
#define EQLT 18  /* equality */
#define NTEQ 19  /* not equal */
#define LSTN 20  /* less than */
#define GRTN 21  /* greater than */
#define LTEQ 22  /* less than or equal */
#define GTEQ 23  /* greater than or equal */
#define LGAN 24  /* logic and */
#define LGOR 25  /* logic or */
#define ASSN 26  /* assignment */
#define ADAS 27  /* addition-assignment */
#define SBAS 28  /* subtraction-assign */
#define MLAS 29  /* multiply-assignment */
#define DVAS 30  /* division-assignment */
#define MDAS 31  /* modulus-assignment */
Eb = LIADO An a del */
#define ARGM 33  /* argument */
#define EXLB 34  /* explicit label */
#define GOTO 35  /* jump to exp. label */
#define CASE 36  /* case statement */
#define TVAR 37  /* temporary variable # */
#define SWTC 38  /* switch statement */
#define PNTR 39  /* pointer */
#define ADDR 40  /* address */
#define RTRN 41  /* return */
#define INDX 42  /* index */
#define FNDC 45  /* function declaration */
#define STMT 46  /* statement */
#define DUMY 47  /* dummy statement */
#define BREK 48  /* break statement */
#define DFLT 49  /* default case */
#define INCR 50  /* increment */
#define DCRT 51  /* decrement */
#define INCL 52  /* increment, later */
#define DCRL     /* decrement, later */
#define NOOP     /* no operation */
#define TEMP     /* temporary variable */
#define CNVB     /* convert to boolean */
#define ENER     /* boolean temporary */
#define FEND     /* function end */
#define MAIN     /* main function */
#define MEND     /* end of main function */
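
As an illustration of how a later pass might consume these codes, the sketch below dispatches on the five arithmetic codes with a `switch`. It is not taken from the thesis: `apply_arith` is a hypothetical helper, and the name `ISUB` and the value 3 for `IMUL` are readings of damaged entries in the listing rather than confirmed definitions.

```c
/* Arithmetic intermediate codes. ISUB (name) and IMUL (value) are
   assumed readings of damaged listing entries. */
#define IADD 1 /* integer addition */
#define ISUB 2 /* integer subtraction */
#define IMUL 3 /* integer multiply */
#define IDIV 4 /* integer division */
#define MDLS 5 /* integer modulus */

/* Hypothetical helper: applies one arithmetic code to two operands.
   Returns 1 and stores the result in *out on success; returns 0 for
   a non-arithmetic code or a division/modulus by zero. */
int apply_arith(int op, int lhs, int rhs, int *out)
{
    switch (op) {
    case IADD:
        *out = lhs + rhs;
        return 1;
    case ISUB:
        *out = lhs - rhs;
        return 1;
    case IMUL:
        *out = lhs * rhs;
        return 1;
    case IDIV:
        if (rhs == 0)
            return 0;
        *out = lhs / rhs;
        return 1;
    case MDLS:
        if (rhs == 0)
            return 0;
        *out = lhs % rhs;
        return 1;
    default:
        return 0;
    }
}
```

For example, `apply_arith(IMUL, 3, 37, &v)` succeeds and leaves 111 in `v`, while a division by zero is rejected instead of being evaluated.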
APPENDIX F


TEST PROGRAMS FOR THE TINY-C COMPILER


Program 1. 


main()
{
int joe, jimmy;
joe=225;
jimmy=15;
switch ( joe * 5)
{
case 10 : ++joe;
break;
default : joe=jimmy+27;
break;
case 14 : joe= jimmy--;
break;
}
}
The Code for the Program 1.


codeseg equ
dataseg equ


sym1
sym2


main:


imlbl2:


imlbl3:


imlbl4:


imlbl0:


imlbl1:


org
ds
ds


org
jmp


move
move
move
mul
move
move
move
jmp


move
add
move
jmp


move
move
add

move


jmp


move
sub
move
move
jmp


move
move
if
move
if
jmp


stop





(2:0)
(1:0) 
dataseg 
1 

1 


codeseg
main 


sancı. 5) 
Cusco lor. rone l) 
Fu.) 
Pasa), rias) 
r(4d:4),syml 
r(4:1), syma 
PCM), SYS 
imlblu 


symi,r(4:4) 

e, A ne) 
nia: 1 ges yond 

imlbli 


time; ey al) 
sym 1) 

Tə 1) A A) 
r(4:4), syml 

dinl BU 


вутс rm (2202) 

Conni, (7:42: el) 
r (2:02), symi 

r (4210. S 

imlbli 


sym3, r (41:14) 

{int la, r (VIr) 
r(a:4)==r(4M:1),implbl 
MR, e. rr 0032) 
r(a4:4)==r (4:23), implbl4 
implbl3 
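
The listing for Program 1 appears to place the case bodies first (labels imlbl2, imlbl3, imlbl4) and the dispatch code last (label imlbl0): the selector is compared with each case constant in turn, and control falls through to a jump to the default body. The sketch below mirrors that layout in plain C with `goto`. It is an illustration, not the compiler's output; the case constant 10 and the expression `jimmy + 27` are readings of the damaged source text, and both function names are hypothetical.

```c
/* Reference: the switch as written in the test program
   (constants are assumed readings of the damaged listing). */
int switch_reference(int joe, int jimmy)
{
    switch (joe * 5) {
    case 10:
        ++joe;
        break;
    default:
        joe = jimmy + 27;
        break;
    case 14:
        joe = jimmy--;
        break;
    }
    return joe;
}

/* Same behaviour laid out like the generated code:
   case bodies first, compare-and-jump dispatch at the end. */
int switch_lowered(int joe, int jimmy)
{
    int selector = joe * 5;   /* value of the switch expression */
    goto dispatch;            /* corresponds to the jump to imlbl0 */

case_10:                      /* imlbl2 */
    ++joe;
    goto done;
case_default:                 /* imlbl3 */
    joe = jimmy + 27;
    goto done;
case_14:                      /* imlbl4 */
    joe = jimmy--;
    goto done;

dispatch:                     /* imlbl0 */
    if (selector == 10)
        goto case_10;
    if (selector == 14)
        goto case_14;
    goto case_default;

done:                         /* imlbl1 */
    return joe;
}
```

Both functions return the same value for every input, which is the property a lowering of this kind has to preserve. Because `joe * 5` is always a multiple of 5, the `case 14` body is never reached in this program; the dispatch code still tests for it.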
Program 2.


main()
{
int joe, jimmy;
joe=3;
jimmy=joe*37;
if ( joe > 37 && jimmy<=joe && jimmy )
{
jimmy=18;
}
else
jimmy=27;
joe=jimmy+25;
}
The Code for the Program 2.


codeseg equ
dataseg equ


sym1
sym2


main:


blbl0:


blbl1:


blbl2:


blbl3:


blbl4:


blbl5:


imlbl0:


imlbl1:


org
ds
ds


org
jmp


move
move
mul
move
if
move
jmp


move


if
move
jmp


move


and
if
move
jmp


move


and
move
move
if
move
move
jmp


(2:0) 
(1:0) 
dataseg 
1 
1 


codeseg


main 


C988 vi, Cüzi) 

t 3e 57, 0B: 1) 
МК, (О) 
ама, 5 / kamt vi e 
TUZ pas), bibl 
¿pom talip) ,r( 0:5) 
Bilimi 


ube) SEELEN, r UA: 2) 


roğşi ) < rağ), bibl 
Фуз01 "Ра Гбе ТИ (wr) 
btia 


(¿becl, true), r (0: 4) 


r Game). rui: ^) 

rM) -2wmnt,u0..bfT'6t14 
bek, teme m cd:5) 
bb 


(peel, facie}, r (075) 


ri), ve) 

"AED. symi 

r (Uli), syme 

ri; >s)==tbocl, false), Mb II: 
tim Lek, OUI) 

ce 2:02), symu 

linm 


tit ez >, vu) 
nc (2:02), symüe 


i rete», Oma) 
symnc,ro1) 
r (4: 1), PORN 
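
The code for Program 2 appears to evaluate each comparison into a boolean temporary (the `{bool,true}` and `{bool,false}` operands) and to combine the temporaries with `and` before the final test. The sketch below follows the same steps in plain C. It is an assumption-laden illustration: the source values (3, 37, 18, 27, 25) are readings of the damaged program text, and `run_program2` and the temporaries `b0`, `b1`, `b2` are hypothetical names.

```c
/* Runs the statements of Program 2 step by step, with boolean
   temporaries modelled as ints holding 0 or 1. The final values
   of joe and jimmy are returned through the two pointers. */
void run_program2(int *joe_out, int *jimmy_out)
{
    int joe, jimmy;
    int b0, b1, b2;               /* hypothetical boolean temporaries */

    joe = 3;
    jimmy = joe * 37;

    b0 = (joe > 37) ? 1 : 0;      /* joe > 37 */
    b1 = (jimmy <= joe) ? 1 : 0;  /* jimmy <= joe */
    b0 = b0 & b1;                 /* first && as a plain "and" */
    b2 = (jimmy != 0) ? 1 : 0;    /* convert integer jimmy to boolean */
    b0 = b0 & b2;                 /* second && */

    if (b0)
        jimmy = 18;
    else
        jimmy = 27;
    joe = jimmy + 25;

    *joe_out = joe;
    *jimmy_out = jimmy;
}
```

Under these assumed values the condition is false, so the run ends with jimmy equal to 27 and joe equal to 52. Evaluating all three operands, rather than short-circuiting, gives the same result here because none of the operands has a side effect.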


Program 3.


function(joe, jimmy)
int joe, jimmy;
{
do
joe = jimmy++;
while ( joe == 2);
—— 3 10” 5 
}
main()
{
int joe, jimmy;
jimmy=5;
do
joe = jimmy--;
while ( joe == 3);
function ( jimmy, joe);
++ jimmy ;
}




The Code for the Program 3.


codeseg equ
dataseg equ


sym1
sym2
sym4
sym5


org
ds
ds
ds
ds


org
jmp


function:


imlbl0:


blbl0:


blbl1:


imlbl1:


main:


imlbl2:


pop
pop
move
move


move
add
move
if
move
jmp


move


move
move
if


move
sub
push
rts


move
move
move


move
sub
move
if
move
jmp


move





(2:0)
(1:0)
dataseg
1


1
1
1


codeseg
main


s(4M),r(4:4) 
Ss (UA r (ids: I) 
“(д:002), вут1 
ripa: 199 Symm 


syme, rv (is) 
Tinta y GUS 1) 
IMD) 

r(4:0) ==r(4:2),bl1b14 
tbumluyfalsei, (4422) 
pipi 


1“ bucl: (ede”r, r (022) 


rv 2:0), symi 
rari); Syme 
r(49:3)==tbc01,true)?, imlblu: 


sym2,r (4:14) 

{int ay (1) 
{irit D 5 uU» 

s(1) 


{init ese, rege) 
r(4d:4),syms 
*(2:1), зуи2 


symeö, 2:0) 

time ra rom.) 
me, or Gs) 
r(0:0)=r (02), bDlblž 
{bool, false}, r(a: 3) 
Dlbl3 


{poal. trues, rss) 


bip ee 


Mave r(a: 4), sym4 

nive ve(ğ:li),symoö 

if y (030 x boocl, true}, imlblē: 
ra Drg: 

mave symA, (A: t) 

push r (zi, Stéi 

mave syme,r 2:1) 

push KEE ecd) 

jsr furcall,s(1) 

pap 5 (2),“*4(д:2) 

ааа times St EIER 

stop
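
Program 3 exercises `do ... while` loops whose body assigns from a post-incremented or post-decremented variable. The sketch below isolates the two loop shapes so that their effect can be checked on a host compiler. It is an illustration under assumptions: the loop bodies follow a reading of the damaged source text, and the helper names `loop_post_increment` and `loop_post_decrement` are hypothetical.

```c
/* do joe = jimmy++; while (joe == stop);
   Returns the final joe; *jimmy is updated in place. */
int loop_post_increment(int *jimmy, int stop)
{
    int joe;
    do
        joe = (*jimmy)++;
    while (joe == stop);
    return joe;
}

/* do joe = jimmy--; while (joe == stop);
   Returns the final joe; *jimmy is updated in place. */
int loop_post_decrement(int *jimmy, int stop)
{
    int joe;
    do
        joe = (*jimmy)--;
    while (joe == stop);
    return joe;
}
```

Starting from jimmy equal to 5 with a stop value of 3, as in the reconstructed `main`, the decrement loop runs once and leaves joe at 5 and jimmy at 4. The body always executes at least once, which is the property that distinguishes `do ... while` from `while`.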
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